Abstract

Interest in problems of statistical inference connected to measurements of quantum 
systems has recently increased substantially, in step with dramatic new developments in 
experimental techniques for studying small quantum systems. Furthermore, theoretical 
developments in the theory of quantum measurements have brought the basic mathemat- 
ical framework for the probability calculations much closer to that of classical probability 
theory. The present paper reviews this field and proposes and interrelates a number 
of new concepts for an extension of classical statistical inference to the quantum con- 
text. (An earlier version of the paper containing material on further topics is available as 
quant-ph/0307189). 
1 Introduction



Quantum mechanics has replaced classical (Newtonian) mechanics as the basic paradigm for 
physics. From there it pervades chemistry, molecular biology, astronomy, cosmology, .... The
theory is fundamentally stochastic: the predictions of quantum mechanics are probabilistic.
When used to derive properties of matter, the stochastic nature of the theory is typically
swallowed up by the law of large numbers (very large numbers, like Avogadro's, $10^{23}$). However, in
some situations randomness does appear on the surface, most familiarly in the random times 
of clicks of a Geiger-counter. Present-day physicists, challenged by the fantastic theoretical 
promise of a quantum computer, are carrying out experiments in which half a dozen ions 
are held in an ion-trap and individually pushed into lower or higher energy states, and into 
quantum superpositions of such joint states. The existence of these wavelike superpositions of 
combinations of distinct states of distinct objects is a fundamentally quantum phenomenon 
called entanglement. Entanglement is of enormous importance in quantum computation and 
quantum communication. In other experiments, using supercooled electric circuits, billions of 
electrons behave as a single quantum particle which is brought into a wavelike superposition 
of macroscopically distinct states (clockwise and anti-clockwise current flow, for instance).
This was recently achieved in Delft by Mooij et al. (1999) using a SQUID (superconducting
quantum interference device). Hannemann et al. (2002) recently implemented a Bayesian
sequential adaptive design-and-estimation procedure to determine the state of 12 identically
prepared two-level systems. 

In these experiments, single quantum systems are individually manipulated and probed. 
The outcomes of measurements are random, with a probability distribution which depends 
on the one hand, on which quantum measurement (the experimental design) was carried out, 
and on the other hand, on the state of the quantum system being measured. If one does not 
know in advance the state of the quantum system, or wants to use the measurement results in 
order to prove that a certain state had been created, one is dealing with statistical estimation 
and testing problems for data from a probabilistic model with a rather elegant mathematical 
structure, as we shall see. 

By the nature of quantum mechanics, measurement of a quantum system disturbs the 
system. The complete specification of a particular experiment tells us not only how the 
distribution of the data depends on the state of the quantum system being measured, but 
also how the state of the system after the measurement depends on its initial state and on the 
outcome which was observed. This complete specification is described mathematically by a 
quantum instrument. Measuring the system in one way precludes measuring it simultaneously 
in a different way. The total amount of information which can be obtained about an unknown 
parameter of the state of a quantum system is finite. Quantum physics delineates in a very 
precise way the class of all possible instruments. Thus, before looking at which experiments are 
practically feasible, one can already investigate mathematically the limits of the information 
which can be extracted from an unknown quantum system, leading to advice on various 
experimental strategies. 

The field of quantum statistical inference studies these problems in a unified and systematic
way. Established a quarter of a century ago in the classic monographs of Helstrom (1976)
and Holevo (1982), it is currently under vigorous renewed development, stimulated by
experimental efforts in nanotechnology, and the rapid theoretical development of quantum 
communication, quantum cryptography, quantum computation, and quantum information 
theory. 






Though real laboratory experiments involve highly complex models and severe practical 
limitations, the basic theory and the basic statistical issues should be accessible to a general 
statistical audience. The most elementary models involve 2x2 complex matrices, some linear 
algebra and elementary probability. Such models already allow one to state problems of 
statistical design and inference which we are only just starting to understand, and which are 
relevant to experimentalists and theoreticians in quantum information. 

The purpose of this paper is to introduce this problem area to the statistical community. 
We set up the basic statistical modelling in the simplest of settings, namely that of a small 
collection of two-dimensional quantum systems. Depending on the context, such quantum 
systems are called 'spin-half systems' (the spin of an electron, for instance), or 'two-level 
atoms' (ground state and first excited state for atoms in an ion trap, at very low temperature), 
or 'qubits' (the 'bits' of the RAM of a future quantum computer for which various technologies 
are being currently explored; one possibility being a supercooled aluminium ring in which an 
electric current might flow clockwise or anti-clockwise). Also covered is the polarisation 
of photons, leading us to phenomena studied in quantum optics such as violation of the
Bell (1964) inequalities in the Aspect et al. (1982) experiment, of great current interest; see
Weihs et al. (1998), Gill (2003). Thus the same mathematical and statistical modelling covers
a multitude of applications.
1.1 Overview 

The paper is organised as follows. Section 2 describes the mathematical structure linking
states of a quantum system, possible measurements on that system, and the resulting state
of the system after measurement. Section 3 introduces quantum statistical models and notions
of quantum score and quantum information, parallel to the score function and Fisher
information of classical parametric statistical models. In Section 4 we introduce quantum
exponential models and quantum transformation models, again forming a parallel with
fundamental classes of models in classical statistics. In Section 5 we describe notions of quantum
exhaustivity, quantum sufficiency and quantum cuts of a measurement, relating them to the
classical notions of sufficiency and ancillarity. We next turn, in Section 6, to a study of the
relation between quantum information and classical Fisher information, in particular through
Cramér-Rao type information bounds. In Section 7 we discuss the interrelation between
classical and quantum probability and statistics. Finally, in Section 8 we conclude with remarks
on further topics of potential interest to probabilists and statisticians. Two of these sections
contain a considerable amount of new work.

This paper complements our more mathematical survey (Barndorff-Nielsen, Gill, and Jupp,
2001a) on quantum statistical information. A version of this paper with much further material
(such as foundational questions, Bell inequalities, infinite dimensional spaces, continuous time
observation of a quantum system) is available as Barndorff-Nielsen, Gill, and Jupp (2001b).
Many further details can be found in Barndorff-Nielsen, Gill, and Jupp (2003). Gill (2001a)
is a tutorial introduction to the basic modelling, while Gill (2001b) is an introduction to large
sample quantum estimation theory. Some general references which we have found extremely
useful are the books of Isham (1995), Peres (1995), Gilmore (1994) and Holevo (1982, 2001a).
The reader is also referred to the 'bible of quantum information', Nielsen and Chuang (2000),
which contains excellent introductory material on the physics and the computer science, and
to the basic probability and statistics text Williams (2001), which recognises (Chapter 10)
quantum probability as a topic which should be in every statistician's basic education.
Finally, the former Los Alamos National Laboratory preprint service for quantum physics, now
at Cornell, quant-ph at http://arXiv.org, is an invaluable resource.



2 States, Measurements and Instruments 
2.1 The Basics 

The state of a finite-dimensional quantum system is described or represented by a $d \times d$
matrix $\rho$ of complex numbers, called the density matrix. The number $d$ is the dimension
of the system and already the case $d = 2$ is rich both in mathematical structure and in
applications, some of which were mentioned above. We shall write $\mathcal{H} = \mathbb{C}^d$ for the Hilbert
space of $d$-dimensional complex vectors, also called the state space of the system. The inner
product of vectors $\phi$ and $\psi$, written by physicists as $\langle\phi|\psi\rangle$ and by mathematicians as $\phi^*\psi$,
equals $\sum_i \bar{\phi}_i \psi_i$ (the bar denotes complex conjugation). The length or norm $\|\phi\|$ of a vector
$\phi$ is defined through $\|\phi\|^2 = \langle\phi|\phi\rangle$.

A density matrix $\rho$ has to be nonnegative and of trace 1, these properties being defined
as follows. The trace of a square matrix is defined in the usual way as the sum of its diagonal
elements. The definition of nonnegativity is a little more complicated. First, for an arbitrary
complex matrix $X$ we define the adjoint $X^*$ of $X$ to be the matrix obtained from $X$ by taking
its transpose and replacing each element by its complex conjugate. An element $\psi$ of the state
space $\mathcal{H}$ is to be thought of as a column vector and hence $\psi^*$ is a row vector containing the
complex conjugates of the elements of $\psi$. Since $\rho$ is a $d \times d$ matrix, the quadratic form $\psi^* \rho \psi$
is a complex scalar. The statement that $\rho$ is nonnegative means just that $\psi^* \rho \psi$ is real and
nonnegative for every $\psi \in \mathcal{H}$.

Physicists would write $|\psi\rangle$ for the column vector $\psi$, $\langle\psi|$ for the row vector $\psi^*$, and $\langle\psi|\rho|\psi\rangle$
for the number $\psi^*\rho\psi$. In particular, $\langle\psi|\psi\rangle = \|\psi\|^2$ is a number, while if $\|\psi\| = 1$ then $|\psi\rangle\langle\psi|$
is the matrix which projects onto the one-dimensional subspace of $\mathcal{H}$ spanned by $\psi$. This
bra-ket notation, due to Dirac, appears at first sight merely to require superfluous typing but
it does give a visual clue to the status of various objects and moreover provides a short-hand
whereby the name of the bra or ket $\psi$ can be replaced by some identifying words or symbols,
as in |e), I©). 

It can be shown that a nonnegative matrix is automatically self-adjoint, i.e. $\rho = \rho^*$.
Self-adjoint matrices share some familiar properties of symmetric real matrices: one can find an
orthonormal basis of eigenvectors, and the eigenvalues are real numbers. The trace of a matrix
equals the sum of the eigenvalues. A nonnegative matrix has nonnegative eigenvalues. Thus
the eigenvalues of $\rho$ can be interpreted as a probability distribution over $\{1, \ldots, d\}$. As we
shall see, this probability distribution has a physical meaning: the state $\rho$ can be thought
of as a probability mixture over a collection of $d$ states, each associated with one of the
eigenvectors and of a special type called a pure state. A probability mixture $\rho = \sum_i p_i \rho_i$ of
density matrices is again a density matrix. This is the state obtained by taking the quantum
system in state $\rho_i$ with probability $p_i$.

Example 1 (Pure state and mixed state). Let $|\phi_1\rangle, \ldots, |\phi_d\rangle$ denote an orthonormal
basis of $\mathcal{H}$. If the basis is clear from the context, we can exploit the bra-ket notation and
abbreviate these vectors to $|1\rangle, \ldots, |d\rangle$. Let $p_1, \ldots, p_d$ denote a probability distribution over
$\{1, \ldots, d\}$. Write

$$\rho = \sum_i p_i \rho_i, \qquad \rho_i = |i\rangle\langle i|. \tag{1}$$

Note that $\rho_i = |i\rangle\langle i|$ is a $d \times d$ matrix of rank one. It represents the operator which projects an
arbitrary vector into the one-dimensional subspace of $\mathcal{H}$ consisting of all (complex) multiples
of $|i\rangle$. One can easily check that it is a density matrix. Such a state, with density matrix
being a rank-one projector and characterised by a unit length state vector in $\mathcal{H}$, is a pure
state. It follows that $\rho$, a probability mixture of density matrices, is also a density matrix.
By the eigenvalue-eigenvector decomposition of self-adjoint matrices, any density matrix can
be written in the form (1) with the vectors $|i\rangle$ orthonormal.

If a density matrix $\rho$ is of rank 1, one can write $\rho = |\phi\rangle\langle\phi|$ for some vector $|\phi\rangle$ with
$\|\phi\|^2 = \langle\phi|\phi\rangle = 1$. The state is called a pure state and $|\phi\rangle$ is called the state vector; it is
unique up to a complex factor of modulus 1. If the rank of a density matrix is greater than 1
then the state is called mixed. It can be written as a mixture of pure states in many different
ways, especially if one does not insist that the state vectors of the pure states are orthogonal
to one another. □
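The non-uniqueness of mixture decompositions noted in Example 1 can be checked numerically for $d = 2$. The following is a minimal sketch in plain Python; the helper names `outer`, `mix` and `trace` are ours and are introduced only for illustration. It mixes the two basis states with equal weights, separately mixes two other orthonormal states with equal weights, and compares the resulting density matrices.

```python
import math

def outer(v, w):
    """Return the matrix |v><w| for vectors given as lists of complex numbers."""
    return [[a * b.conjugate() for b in w] for a in v]

def mix(weights, states):
    """Return the probability mixture sum_i p_i rho_i of square matrices."""
    d = len(states[0])
    return [[sum(p * s[r][c] for p, s in zip(weights, states))
             for c in range(d)] for r in range(d)]

def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))

s = 1 / math.sqrt(2)
ket1, ket2 = [1 + 0j, 0j], [0j, 1 + 0j]            # the basis |1>, |2>
plus, minus = [s + 0j, s + 0j], [s + 0j, -s + 0j]  # another orthonormal basis

rho_a = mix([0.5, 0.5], [outer(ket1, ket1), outer(ket2, ket2)])
rho_b = mix([0.5, 0.5], [outer(plus, plus), outer(minus, minus)])

# Both mixtures should describe the same state (half the identity matrix).
same = all(abs(rho_a[r][c] - rho_b[r][c]) < 1e-12
           for r in range(2) for c in range(2))
print(trace(rho_a).real, trace(rho_b).real, same)
```

Both mixtures have trace 1 and agree entry by entry, so no measurement can tell which of the two preparation procedures was used.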

The density matrix of a quantum system encapsulates in a very concise but rather abstract 
way all the predictions one can make about future observations on that system, or more 
generally, all results of interaction of the system with the real world. 

So far we have been using the word 'measurement' in a rather loose way, but at this
point it is important to make the technical distinction between mathematical models for a
measurement when we do not care about the state of the system after the measurement, but
only about the outcome, and models for a measurement including the state of the system
after the measurement. The former is called a measurement and denoted by $M$; the latter,
more complicated object, is called an instrument and denoted by $\mathcal{N}$.

Let us start with the simpler object, a measurement. Consider a measurement with
discrete outcome, i.e. the sample space of the outcome is at most countable. From quantum
theory it follows that any measurement whatsoever, i.e. any experimental set-up, is described
mathematically by a collection $M$ of $d \times d$ matrices $m(x)$ indexed by the outcomes $x$ of the
experiment. The matrices have to be nonnegative (and hence also self-adjoint) and must
add up to the identity matrix $\mathbf{1}$. Let us write $p(x; \rho, M)$ for the probability that applying
the measurement $M$ to the state $\rho$ produces the outcome $x$. Then we have the fundamental
formula

$$p(x; \rho, M) = \mathrm{trace}(\rho\, m(x)). \tag{2}$$

One can see that this expression indeed defines a bona-fide probability distribution as follows.
Writing $\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$ and permuting cyclically the elements in a trace of a product of
matrices, one finds $\mathrm{trace}(\rho\, m(x)) = \sum_i p_i\, \mathrm{trace}(|\phi_i\rangle\langle\phi_i|\, m(x)) = \sum_i p_i \langle\phi_i| m(x) |\phi_i\rangle$. Thus, since
$m(x)$ is a nonnegative matrix and the $p_i$ are probabilities, $p(x; \rho, M)$ is a nonnegative real
number. Moreover, the sum over $x$ of these numbers is $\sum_x \mathrm{trace}(\rho\, m(x)) = \mathrm{trace}(\rho \sum_x m(x)) =
\mathrm{trace}(\rho \mathbf{1}) = \mathrm{trace}(\rho) = 1$.
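As a small numerical illustration of formula (2), the sketch below evaluates $\mathrm{trace}(\rho\, m(x))$ for a four-outcome measurement on a two-dimensional system. The particular state `rho`, the measurement (choose one of two orthonormal bases with probability 1/2 each, then project) and the function name `born_probabilities` are our own assumptions, chosen only to show a measurement whose matrices $m(x)$ are nonnegative and sum to the identity without being projectors.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))

def born_probabilities(rho, measurement):
    """Map each outcome x to trace(rho m(x)), as in formula (2)."""
    return {x: trace(matmul(rho, m)).real for x, m in measurement.items()}

# Illustrative density matrix: self-adjoint, trace 1, nonnegative.
rho = [[0.75, 0.25], [0.25, 0.25]]

# Hypothetical measurement: half the weight on the projectors of the standard
# basis, half on the projectors of the basis (|1> + |2>)/sqrt(2), (|1> - |2>)/sqrt(2).
measurement = {
    "z+": [[0.5, 0.0], [0.0, 0.0]],
    "z-": [[0.0, 0.0], [0.0, 0.5]],
    "x+": [[0.25, 0.25], [0.25, 0.25]],
    "x-": [[0.25, -0.25], [-0.25, 0.25]],
}

probs = born_probabilities(rho, measurement)
print(probs, sum(probs.values()))
```

The four probabilities are nonnegative and add to 1, as the argument following (2) guarantees for any valid choice of $\rho$ and $M$.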

A quantum statistical model is a model for a partly or completely unknown state. This
means that the state $\rho$ is supposed to depend on an unknown parameter $\theta$ in some parameter
space $\Theta$. Write $\rho = (\rho(\theta) : \theta \in \Theta)$. When we apply a measurement $M$ to a quantum system
from this model, the outcome has probability density

$$p(x; \theta, M) = \mathrm{trace}(\rho(\theta)\, m(x)). \tag{3}$$

Thus given the measurement and the quantum statistical model, a classical statistical infer- 
ence problem is defined. Very important problems also arise when the measurement itself is 
indexed by an unknown parameter, but for reasons of space we do not address these here. 
In principle, any measurement $M$ whatever could be implemented as a laboratory experiment.
Equation (3) tells us implicitly how much information about $\theta$ can be obtained from
a given experimental set-up $M$. One may try to choose $M$ in such a way as to maximise the
information which the experiment will give about $\theta$. Such experimental design problems are
a main subject of this paper.

Often we are interested also in the state of the system after the measurement. In this
case we need the more general notion of instrument. An instrument $\mathcal{N}$ (more precisely, a
'completely positive instrument') is represented by a family of collections of $d \times d$ matrices
$n_i(x)$ satisfying $\sum_x \sum_i n_i(x)^* n_i(x) = \mathbf{1}$ but being otherwise completely arbitrary. The index $x$
refers to the observed outcome of the measurement; the index $i$ could be thought of as 'missing
data'. Define $m(x) = \sum_i n_i(x)^* n_i(x)$. It follows that the matrices $m(x)$ are nonnegative (and
self-adjoint) and add to the identity matrix, and thus represent a measurement (in the narrow
or technical sense) $M$. When we apply the instrument $\mathcal{N}$ to the quantum system, the outcome
has the same probability density as (2), but we write it out in terms of $\mathcal{N}$ as

$$p(x; \rho, \mathcal{N}) = \sum_i \mathrm{trace}\bigl(\rho\, n_i(x)^* n_i(x)\bigr), \tag{4}$$

and the state of the system after applying the measurement, conditioned on observing the
outcome $x$, is

$$\sigma(x; \rho, \mathcal{N}) = \frac{\sum_i n_i(x)\, \rho\, n_i(x)^*}{\sum_i \mathrm{trace}\bigl(\rho\, n_i(x)^* n_i(x)\bigr)}. \tag{5}$$

The reader should check that the expression for $\sigma(x; \rho, \mathcal{N})$ does define a bona-fide density
matrix (nonnegative, trace 1). In some important practical problems the instrument itself
depends on an unknown parameter, but here we suppose it is completely known.
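The check suggested to the reader can also be carried out numerically. The sketch below is an illustration under our own assumptions: the two-outcome instrument (a damping-type example with a single value of the index $i$ and strength `g = 0.5`), the input state, and the function name `apply_instrument` are not taken from the paper. For each outcome it forms $\sum_i n_i(x)\,\rho\,n_i(x)^*$, whose trace is the outcome probability by cyclicity of the trace, and normalises it to obtain the posterior state.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def dagger(a):
    """Adjoint: transpose and complex-conjugate every element."""
    n = len(a)
    return [[complex(a[c][r]).conjugate() for c in range(n)] for r in range(n)]

def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))

def apply_instrument(rho, kraus):
    """kraus maps each outcome x to the list of matrices n_i(x).

    Returns {x: (probability of x, posterior density matrix given x)}.
    """
    n = len(rho)
    result = {}
    for x, ops in kraus.items():
        unnorm = [[0j] * n for _ in range(n)]
        for op in ops:
            term = matmul(matmul(op, rho), dagger(op))
            unnorm = [[unnorm[r][c] + term[r][c] for c in range(n)]
                      for r in range(n)]
        p = trace(unnorm).real  # equals sum_i trace(rho n_i(x)* n_i(x))
        post = [[v / p for v in row] for row in unnorm] if p > 0 else None
        result[x] = (p, post)
    return result

g = 0.5  # illustrative damping strength
kraus = {
    0: [[[1.0, 0.0], [0.0, math.sqrt(1 - g)]]],
    1: [[[0.0, math.sqrt(g)], [0.0, 0.0]]],
}
rho = [[0.5, 0.5], [0.5, 0.5]]

outcome = apply_instrument(rho, kraus)
for x, (p, post) in outcome.items():
    print(x, p, post)
```

In this example the matrices $n_0(0)^* n_0(0)$ and $n_0(1)^* n_0(1)$ add to the identity, the two outcome probabilities add to 1, and each posterior matrix has trace 1.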

It follows from quantum physics that whatever one can do to a quantum system has to
have the form of a quantum instrument. Moreover, in principle, any quantum instrument
whatsoever could be realised by some experimental set-up. Usually in the theory one starts
by postulating some natural physical properties of the transformation from input or prior
state to output or posterior state and data, and derives (4) and (5), which are then called
the Kraus representation of the instrument, as a theorem. Here it is more convenient to start
with (4) and (5). Further discussion and references are given in a later section.

One could consider applying two different quantum instruments, one after the other, to the
same quantum system. One might even allow the choice of second instrument to depend on
the outcome obtained from the first. The composition of two instruments in this way defines
a new one; it is not difficult to express the matrices $n_i(x)$ of the new instrument in terms of
those of the old ones. Another way to get new instruments from old is by coarsening. Suppose
one applies one instrument to a quantum system, then applies a many-to-one function of the
outcome, and discards the original data. The new instrument can be written down in terms of
the old by relabelling the matrices $n_i(x)$ with new index $j$ and variable $y$ in obvious fashion.

In classical statistics, central notions such as sufficiency are connected to decomposing sta- 
tistical models into parts (marginal and conditional distributions), and to reducing statistical 
models by reducing data. Starting with a quantum statistical model with density matrices $\rho$
depending on a parameter $\theta$, possibly with nuisance parameters too, it is now natural to ask
whether notions akin to sufficiency and ancillarity can be developed for instruments. For in- 
stance, it might happen that the posterior state of a quantum system after applying a certain 
instrument no longer depends on the unknown parameter. 




In the next subsection we shall work out many of these notions for the important special
case of a two-dimensional quantum system. But first we present two special examples, con- 
necting the notion of instrument to the classical notions (in quantum physics) of observables 
and unitary transformations. 

Example 2 (Simple instruments, simple measurements). Let $x_1, \ldots, x_d$ denote $d$
distinct real numbers and let $|\psi_x\rangle$, $x \in \{x_1, \ldots, x_d\}$, be an orthonormal basis of $\mathcal{H}$ indexed by
the numbers in $\mathcal{X} = \{x_1, \ldots, x_d\}$. We can now define an instrument $\mathcal{N}$ with outcomes in $\mathcal{X}$ by
supposing that the index $i$ takes only one value, let us call it 0, and taking $n_0(x) = |\psi_x\rangle\langle\psi_x|$.
This matrix is self-adjoint and idempotent (equal to its square). Therefore the corresponding
matrices $m(x)$ are given by $m(x) = |\psi_x\rangle\langle\psi_x|$ too, and they sum to the identity matrix: the
sum of projectors onto orthogonal one-dimensional subspaces spanning the whole space is
the identity. This shows that $\mathcal{N}$ is indeed an instrument, though of very special form.

We can now compute the probability of observing the outcome $x$ and the posterior state
of the quantum system, given the outcome is $x$, when the quantum system is originally in the
state $\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$. A straightforward calculation shows that they are given as follows:

$$p(x; \rho, \mathcal{N}) = \sum_i p_i\, |\langle\psi_x|\phi_i\rangle|^2, \tag{6}$$

$$\sigma(x; \rho, \mathcal{N}) = |\psi_x\rangle\langle\psi_x|. \tag{7}$$

These formulae can be interpreted probabilistically as follows. The quantum system was
initially in the pure state with density matrix $|\phi_i\rangle\langle\phi_i|$ with probability $p_i$. On being measured
with the instrument $\mathcal{N}$, the system jumped to the pure state with density matrix $|\psi_x\rangle\langle\psi_x|$,
producing the outcome $x$, with probability $|\langle\psi_x|\phi_i\rangle|^2$.

Let $X = \sum_x x\, |\psi_x\rangle\langle\psi_x|$. This is a self-adjoint matrix with eigenvalues $x_1, \ldots, x_d$ and
eigenvectors $|\psi_{x_1}\rangle, \ldots, |\psi_{x_d}\rangle$. One says that the instrument corresponds to the observable $X$.
'Measuring the observable' with this instrument produces one of the eigenvalues, and forces
the system into the corresponding eigenstate. If the quantum system starts in a pure state
with state vector $\phi$, then it jumps to the eigenstate $\psi_x$ with probability $|\langle\psi_x|\phi\rangle|^2$.

Suppose now $X$ is an arbitrary self-adjoint matrix. Let $\mathcal{X} = \{x_1, \ldots, x_{d'}\}$ denote its
distinct eigenvalues. Let $\Pi(x)$ denote the matrix which projects onto the eigenspace
corresponding to eigenvalue $x$, not necessarily one-dimensional. Thus $X = \sum_x x\, \Pi(x)$. Define
$n_0(x) = m(x) = \Pi(x)$. We see again that the matrices $n_0(x)$ define an instrument $\mathcal{N}$, and
the matrices $m(x)$ define a corresponding measurement $M$. When this instrument is applied
to the quantum system $\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$, one obtains the outcome $x$ with probability
$\sum_i p_i \|\Pi(x)|\phi_i\rangle\|^2$. One may compute that the final state is the mixture, according to the
posterior probabilities that the initial state was $|\phi_i\rangle$ given that the outcome is $x$, of the pure
states with state vectors equal to the normalised projections $\Pi(x)|\phi_i\rangle / \|\Pi(x)|\phi_i\rangle\|$. Yet again
we have the probabilistic interpretation, that with probability $p_i$ the quantum system started
in the pure state with state vector $|\phi_i\rangle$. On measuring the observable $X$, the state vector is
projected into one of the eigenspaces, with probabilities equal to the squared lengths of the
projections. One gets to observe the corresponding eigenvalue. The posterior state is the
mixture of these different pure states according to the posterior distribution of initial state
given data $x$.

When one measures the observable $X = \sum_x x\, \Pi(x)$ with the corresponding simple instrument
or simple measurement, the probability of each eigenvalue $x$ is $\mathrm{trace}(\rho\, \Pi(x))$. It follows
that the expected value of the outcome is $\mathrm{trace}(\rho X)$. More generally, let $f$ be some real
function. One may define the function $f$ of the observable $X$ by $Y = f(X) = \sum_x f(x)\, \Pi(x)$. This is
the self-adjoint matrix with the same eigenspaces, and with eigenvalues equal to the function
$f$ of the eigenvalues of $X$. If the function $f$ is many-to-one then some eigenspaces may have
merged (consider the function 'square', for instance). It follows that the expected value of the
function $f$ of the outcome of measuring $X$ is given by the elegant formula $\mathrm{trace}(\rho f(X))$. We
call this rule the law of the unconscious quantum physicist since it is analogous to the law of
the unconscious statistician, according to which the expectation of a function $Y = f(X)$ of
a random variable $X$ may be calculated by an integration (i) over the underlying probability
space, (ii) over the outcome space of $X$, (iii) over the outcome space of $Y$. Note however that
the simple instruments corresponding to $X$ and to $Y$ are different, and moreover neither is
equal to the instrument 'measure $X$, but record only $y = f(x)$'.
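The law of the unconscious quantum physicist is easy to verify on a small example. The sketch below uses an illustrative state and an observable with eigenvalues 1 and 3 (both our own choices, as are the helper names). It computes the expectation of $f(x) = x^2$ in three ways: by summing $f(x)\,\mathrm{trace}(\rho\,\Pi(x))$ over outcomes, via $\mathrm{trace}(\rho f(X))$ with $f(X)$ built from the spectral decomposition, and via $\mathrm{trace}(\rho X^2)$ with $X^2$ formed by ordinary matrix multiplication.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))

def function_of_observable(f, projectors):
    """Build f(X) = sum_x f(x) Pi(x) from the spectral decomposition of X."""
    n = len(next(iter(projectors.values())))
    return [[sum(f(x) * p[r][c] for x, p in projectors.items())
             for c in range(n)] for r in range(n)]

def expectation_over_outcomes(f, rho, projectors):
    """Sum of f(x) times the outcome probability trace(rho Pi(x))."""
    return sum(f(x) * trace(matmul(rho, p)) for x, p in projectors.items())

rho = [[0.75, 0.25], [0.25, 0.25]]  # illustrative density matrix

# Illustrative observable: eigenvalue 1 on (|1> + |2>)/sqrt(2),
# eigenvalue 3 on (|1> - |2>)/sqrt(2).
projectors = {
    1.0: [[0.5, 0.5], [0.5, 0.5]],
    3.0: [[0.5, -0.5], [-0.5, 0.5]],
}

def square(x):
    return x * x

X = function_of_observable(lambda x: x, projectors)
by_outcomes = expectation_over_outcomes(square, rho, projectors)
by_trace = trace(matmul(rho, function_of_observable(square, projectors)))
by_matrix_square = trace(matmul(rho, matmul(X, X)))
print(by_outcomes, by_trace, by_matrix_square)
```

All three numbers coincide, which is the content of the formula $\mathrm{trace}(\rho f(X))$ in this special case.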

This calculus of expected values of (outcomes of measuring) observables is the basis of the
mathematical theory called quantum probability; some further remarks on this appear in the
later discussion of the interrelation between classical and quantum probability.

Two observables $P$, $Q$ are called compatible if as operators they commute: $PQ = QP$.
A celebrated result of von Neumann is that observables $Q$ and $P$ are compatible if and only
if they are both functions $f(R)$, $g(R)$ of a third observable $R$. Taking $R$ to have as coarse
a collection of eigenspaces as possible, one can show that the results of the following three
instruments are identical: the simple instrument for $Q$ followed by the simple instrument for
$P$, recording the values $q$ of $Q$ and $p$ of $P$; the simple instrument for $P$ followed by the simple
instrument for $Q$, recording the values $q$ of $Q$ and $p$ of $P$; and the simple instrument for $R$,
recording the values $q = f(r)$ and $p = g(r)$ where $r$ is the observed value of $R$. It follows that
the probability distribution of the outcome of measurement of an observable $P$ is not altered
when it is measured (simply, jointly) together with any other compatible observables.

An instrument such that the index $i$ takes only one value, say 0, and such that all $n_0(x)$
are projectors onto orthogonal subspaces of $\mathcal{H}$, together spanning the whole space, is called a
simple instrument. The corresponding measurement is called a simple measurement. Simple
instruments and measurements stand in one-to-one correspondence with observables. The
rule for the transformation of the state under a simple instrument is called the Lüders-von
Neumann projection postulate. □

Example 3 (Instrument with no data). It is possible that the quantum instrument $\mathcal{N}$
transforms the quantum system $\rho$ without actually producing any outcome $x$: in the definition
of an instrument, simply take the outcome space to consist of a single element, let us call it
0. Then the state $\rho$ is transformed by the instrument into the state $\sum_i n_i(0)\, \rho\, n_i(0)^*$ where
the $n_i(0)$ are matrices satisfying $\sum_i n_i(0)^* n_i(0) = \mathbf{1}$. A very special case is obtained when
there is also only one value of the index $i$ and the instrument is defined by a single matrix
$U = n_0(0)$ satisfying $U^*U = UU^* = \mathbf{1}$. State $\rho$ is transformed into $U \rho U^*$. Such a matrix $U$ is
called unitary and it corresponds to an orthogonal change of basis. A unitary transformation
is invertible, and corresponds to the reversible time evolution of an isolated quantum system;
see Subsection 2.4 below. □

2.2 Spin-half 

Our examples will concern mainly the spin of spin-half particles, where the dimension $d$ of
$\mathcal{H}$ is 2. Unfortunately, it would take us too far afield to explain the significance of the word
half. The classic example in this context is the 1922 experiment of Stern and Gerlach, see
Brandt and Dahmen (1995, Section 1.4), to determine the size of the magnetic moment of the
electron. The electron was conceived of as spinning around an axis and therefore behaving as
a magnet pointing in some direction. Mathematically, each electron carries a vector 'magnetic 
moment'. One might expect the sizes of the magnetic moment of all electrons to be the same, 
but the directions to be uniformly distributed in space. Stern and Gerlach made a beam of 
silver atoms move transversely through a steeply increasing vertical magnetic field. A silver 
atom has 47 electrons but it appears that the magnetic moments of the 46 inner electrons 
cancel and essentially only one electron determines the spin of the whole system. Classical 
physical reasoning predicts that the beam would emerge spread out vertically according to 
the component of the spin of each atom (or electron) in the direction of the gradient of 
the magnetic field. The spin itself would not be altered by passage through the magnet. 
However, amazingly, the emerging beam consisted of just two well separated components, as 
if the component of the spin vector in the vertical direction of each electron could take on 
only two different values, which in fact are $\pm\frac{1}{2}$ in appropriate units.

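This example fits into the following mathematical framework. Take $d = 2$, then $\mathcal{H} = \mathbb{C}^2$
and $\rho$ is a $2 \times 2$ matrix

$$\rho = \begin{pmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix}$$

with $\rho_{21} = \bar{\rho}_{12}$ and $\rho_{11}$ and $\rho_{22}$ real and nonnegative and adding to 1. The matrix has
nonnegative real eigenvalues $p_1$ and $p_2$ also adding to 1.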

In this case the density matrices of pure states can be put into one-to-one correspondence 
with the unit sphere $S^2$, the surface of the unit ball in real, 3-dimensional space. Directions
in the sphere correspond to directions of spin. This geometric representation is known in
theoretical physics as the Poincaré sphere, in quantum optics as the Bloch sphere, and in
complex analysis as the Riemann sphere. The mixed states, i.e. the convex combinations 
of pure states, correspond to points in the interior of the ball. The mapping from states 
(matrices) to points in the unit ball is affine, as we shall now show. 

Any real linear combination of self-adjoint matrices is again self-adjoint. Since the 2 
diagonal elements of a self-adjoint matrix must be real, and the 2 off-diagonal elements are 
one another's complex conjugate, just 4 real parameters are needed to specify any such matrix. 
By inspection one discovers that the space of self-adjoint matrices is spanned by the identity 
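matrix

$$\mathbf{1} = \sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

together with the three Pauli matrices

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Note that $\sigma_x$, $\sigma_y$ and $\sigma_z$ satisfy the commutation relations

$$[\sigma_x, \sigma_y] = 2i\sigma_z, \qquad [\sigma_y, \sigma_z] = 2i\sigma_x, \qquad [\sigma_z, \sigma_x] = 2i\sigma_y,$$

where, for any operators $A$ and $B$, their commutator $[A, B]$ is defined as $AB - BA$. Note also
that

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \mathbf{1}.$$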

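Any pure state has the form $|\psi\rangle\langle\psi|$ for some unit vector $|\psi\rangle$ in $\mathbb{C}^2$. Up to a complex factor
of modulus 1 (the phase, which does not influence the state), we can write $|\psi\rangle$ as

$$|\psi\rangle = \begin{pmatrix} e^{-i\phi/2} \cos(\theta/2) \\ e^{i\phi/2} \sin(\theta/2) \end{pmatrix}.$$

The corresponding pure state is

$$\rho = \begin{pmatrix} \cos^2(\theta/2) & e^{-i\phi} \cos(\theta/2) \sin(\theta/2) \\
e^{i\phi} \cos(\theta/2) \sin(\theta/2) & \sin^2(\theta/2) \end{pmatrix}.$$

A little algebra shows that $\rho$ can be written as $\rho = (\mathbf{1} + u_x \sigma_x + u_y \sigma_y + u_z \sigma_z)/2 = \frac{1}{2}(\mathbf{1} + u \cdot \sigma)$,
where $\sigma = (\sigma_x, \sigma_y, \sigma_z)$ are the three Pauli spin matrices and $u = (u_x, u_y, u_z) = u(\theta, \phi)$ is the
point on the unit sphere with polar coordinates $(\theta, \phi)$.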

An arbitrary mixed state is obtained by averaging pure states $\rho = \frac{1}{2}(\mathbf{1} + u \cdot \sigma)$ with respect
to any probability distribution over real unit vectors $u$. The result is a density matrix of the
form $\rho = \frac{1}{2}(\mathbf{1} + a \cdot \sigma)$, where $a$ is the centre of mass (a point in the unit ball) of the distribution
of pure states seen as a distribution over the unit sphere. The coordinates of $a$ are called the
Stokes parameters when we are using this model to describe polarization of a photon, rather
than spin of an electron.
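The correspondence between $2 \times 2$ density matrices and points of the unit ball can be explored numerically. The sketch below is illustrative only: the function names are ours, and it relies on the identity $a_k = \mathrm{trace}(\rho\, \sigma_k)$, which is not stated in the text but follows from $\rho = \frac{1}{2}(\mathbf{1} + a \cdot \sigma)$ because the Pauli matrices are traceless, square to the identity, and have traceless pairwise products. It recovers $u(\theta, \phi)$ from a pure state and shows that an equal mixture of two pure states lands strictly inside the ball.

```python
import cmath
import math

SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

def pure_state(theta, phi):
    """Density matrix |psi><psi| for the state vector with polar angles (theta, phi)."""
    psi = [cmath.exp(-0.5j * phi) * math.cos(theta / 2),
           cmath.exp(0.5j * phi) * math.sin(theta / 2)]
    return [[a * b.conjugate() for b in psi] for a in psi]

def bloch_vector(rho):
    """Return (trace(rho sigma_x), trace(rho sigma_y), trace(rho sigma_z))."""
    def trace_of_product(a, b):
        return sum(a[r][k] * b[k][r] for r in range(2) for k in range(2))
    return tuple(complex(trace_of_product(rho, SIGMA[k])).real for k in "xyz")

theta, phi = 1.0, 0.3  # arbitrary illustrative angles
u = bloch_vector(pure_state(theta, phi))
expected = (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

# Equal mixture of the pure states pointing along z and along x.
north, east = pure_state(0.0, 0.0), pure_state(math.pi / 2, 0.0)
mixed = [[0.5 * north[r][c] + 0.5 * east[r][c] for c in range(2)]
         for r in range(2)]
a = bloch_vector(mixed)
length = math.sqrt(sum(v * v for v in a))
print(u, expected, a, length)
```

The vector `a` is the centre of mass of the two unit vectors being mixed, and its length is below 1, in line with mixed states occupying the interior of the ball.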



2.3 Superposition and Mixing 

Given two state vectors $|\phi_1\rangle$ and $|\phi_2\rangle$ and two complex numbers $c_1, c_2$, the state vector
$(c_1|\phi_1\rangle + c_2|\phi_2\rangle)/\|c_1|\phi_1\rangle + c_2|\phi_2\rangle\|$ is called the quantum superposition of the original two
states, with complex weights $c_1, c_2$. This is a completely different way of combining two
states from the mixture $p_1|\phi_1\rangle\langle\phi_1| + p_2|\phi_2\rangle\langle\phi_2|$. (Sometimes the latter is called an 'incoherent
mixture' and the former a 'coherent mixture'.) For example, consider the case $d = 2$, let $|\phi_1\rangle$
and $|\phi_2\rangle$ form an orthonormal basis of $\mathcal{H} = \mathbb{C}^2$, and suppose $c_1 = c_2 = 1/\sqrt{2}$, $p_1 = p_2 = 1/2$.
We consider the equal weights superposition and the equal weights mixture of the pure states
$|\phi_1\rangle$ and $|\phi_2\rangle$, showing how some measurements are able to distinguish between these states,
whereas others are not.

The two matrices $m(1) = |\phi_1\rangle\langle\phi_1|$, $m(2) = |\phi_2\rangle\langle\phi_2|$ define a measurement $M^{\phi}$ with two
possible outcomes 1 and 2, say. The probability distributions of the outcome under the
superposition and under the mixture just defined are identical (probabilities 1/2 for each of
the outcomes 1 and 2).

Define now a new orthonormal basis $|\psi_1\rangle = (|\phi_1\rangle + |\phi_2\rangle)/\sqrt{2}$, $|\psi_2\rangle = (|\phi_1\rangle - |\phi_2\rangle)/\sqrt{2}$.
Corresponding to this basis, one can construct a measurement $M^{\psi}$ in the same way as before.
When the superposition is measured with $M^{\psi}$, the outcome 1 is certain and the outcome 2 is
impossible. However, when the mixture is measured with $M^{\psi}$ the two outcomes have equal
probability 1/2.
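
The four outcome distributions just described can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it takes the standard basis of $\mathbb{C}^2$ as the orthonormal pair, and the names `projector`, `prob` and `table` are ours.

```python
import math

s = 1 / math.sqrt(2)
phi1, phi2 = [1.0, 0.0], [0.0, 1.0]      # orthonormal basis |phi_1>, |phi_2>
psi1, psi2 = [s, s], [s, -s]             # (|phi_1> +/- |phi_2>)/sqrt(2)

def projector(v):
    """|v><v| for a real or complex vector v."""
    return [[complex(v[i]) * complex(v[j]).conjugate() for j in range(2)]
            for i in range(2)]

def prob(rho, m):
    """Outcome probability trace(rho m)."""
    return sum(rho[i][k] * m[k][i] for i in range(2) for k in range(2)).real

# Equal-weights superposition (c1 = c2 = 1/sqrt(2)) and mixture (p1 = p2 = 1/2).
superposition = projector(psi1)
mixture = [[(a + b) / 2 for a, b in zip(row_a, row_b)]
           for row_a, row_b in zip(projector(phi1), projector(phi2))]

def outcome_probs(rho, basis):
    """Probabilities of outcomes 1 and 2 for the measurement built from a basis."""
    return [prob(rho, projector(v)) for v in basis]

table = {
    ("superposition", "M_phi"): outcome_probs(superposition, [phi1, phi2]),
    ("mixture", "M_phi"): outcome_probs(mixture, [phi1, phi2]),
    ("superposition", "M_psi"): outcome_probs(superposition, [psi1, psi2]),
    ("mixture", "M_psi"): outcome_probs(mixture, [psi1, psi2]),
}
```

Up to rounding, only the entry for the superposition measured in the $\psi$ basis differs from (1/2, 1/2): it is (1, 0).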

It is very important to note that a pure state can be expressed as a superposition of others,
and a mixed state as a mixture of others, in many different ways.

2.4 The Schrödinger Equation

Typically the state of a quantum system undergoes an evolution with time under the influence
of an external field. The most basic type of evolution is that of an arbitrary initial state $\rho_0$
under the influence of a field with Hamiltonian $H$. This takes the form

$$\rho_t = e^{tH/i\hbar}\,\rho_0\,e^{-tH/i\hbar},$$

where $\rho_t$ denotes the state at time $t$, $\hbar = 1.05 \times 10^{-34}$ J sec is Planck's constant (divided by $2\pi$), and $H$ is
a self-adjoint operator on $\mathcal{H}$. If $\rho_0$ is a pure state then $\rho_t$ is pure for all $t$ and we can choose
unit vectors $\psi_t$ such that $\rho_t = |\psi_t\rangle\langle\psi_t|$ and

$$\psi_t = e^{tH/i\hbar}\,\psi_0. \tag{8}$$

Equation (8) is a solution of the celebrated Schrödinger equation $i\hbar\,(d/dt)\psi = H\psi$, or equivalently
$i\hbar\,(d/dt)\rho = [H, \rho]$. The matrix $e^{tH/i\hbar}$ is unitary. Conversely, every unitary matrix $U$
can be written in the form $e^{tH/i\hbar}$ for some self-adjoint matrix $H$ and some time $t$ and hence
can be obtained by looking at some Schrödinger evolution at a suitable time.
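
For a spin-half system the evolution operator has a closed form, which makes the statements above easy to check. The sketch below assumes a Hamiltonian of the hypothetical form $H = \hbar\omega\,\vec{n}\cdot\vec{\sigma}$ with $\vec{n}$ a real unit vector, so that $e^{tH/i\hbar} = \cos(\omega t)\mathbf{1} - i\sin(\omega t)\,\vec{n}\cdot\vec{\sigma}$; the function names and the numerical values are illustrative.

```python
import math

ID = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def spin(n):
    """n . sigma for a real unit vector n."""
    return [[n[0] * SX[i][j] + n[1] * SY[i][j] + n[2] * SZ[i][j]
             for j in range(2)] for i in range(2)]

def evolution(n, omega, t):
    """U_t = exp(tH/(i hbar)) for H = hbar * omega * (n . sigma).

    Since (n . sigma)^2 = 1, the exponential series sums to
    cos(omega t) 1 - i sin(omega t) (n . sigma)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    ns = spin(n)
    return [[c * ID[i][j] - 1j * s * ns[i][j] for j in range(2)] for i in range(2)]

def evolve(rho0, n, omega, t):
    """rho_t = U_t rho_0 U_t^*."""
    U = evolution(n, omega, t)
    return matmul(matmul(U, rho0), dagger(U))

# Check i (d/dt) rho = [H / hbar, rho] with a central finite difference.
n, omega, t, h = (1.0, 0.0, 0.0), 1.0, 0.4, 1e-6
rho0 = [[1, 0], [0, 0]]
rho_t = evolve(rho0, n, omega, t)
plus, minus = evolve(rho0, n, omega, t + h), evolve(rho0, n, omega, t - h)
H_over_hbar = [[omega * e for e in row] for row in spin(n)]
left, right = matmul(H_over_hbar, rho_t), matmul(rho_t, H_over_hbar)
residual = max(abs(1j * (plus[i][j] - minus[i][j]) / (2 * h) - (left[i][j] - right[i][j]))
               for i in range(2) for j in range(2))
```

The quantity `residual` measures how far the evolved state is from satisfying the equation for $\rho$; it is small, limited only by the finite-difference step.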



2.5 Separability and Entanglement 

When we study several quantum systems (with Hilbert spaces $\mathcal{H}_1, \ldots, \mathcal{H}_m$) interacting
together, the natural model for the combined system has as its Hilbert space the tensor
product $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_m$. Then a state such as $\rho_1 \otimes \cdots \otimes \rho_m$ represents 'particle 1 in state $\rho_1$
and ... and particle $m$ in state $\rho_m$'. Suppose the states $\rho_i$ are pure with state vectors $|\psi_i\rangle$.
Then the product state we have just defined is also pure with state vector $|\psi_1\rangle \otimes \cdots \otimes |\psi_m\rangle$.
A mixture of such states is called separable.

On the other hand, according to the superposition principle, a complex superposition of 
such state vectors is also a possible state vector of the interacting systems. Pure states whose 
state vectors cannot be written in the product form $|\psi_1\rangle \otimes \cdots \otimes |\psi_m\rangle$ are called entangled.
The same term is used for mixed states which cannot be written as a mixture of pure product 
states. A state which is not entangled is separable. The existence of entangled states is 
responsible for extraordinary quantum phenomena, which scientists are only just starting to 
harness (in quantum communication, computation, teleportation, etc.). 

An important physical feature of unitary evolution in a tensor product space is that, in
general, it does not preserve separability of states. Suppose that the state $\rho_1 \otimes \rho_2$ evolves
according to the Schrödinger operator $U_t = e^{tH/i\hbar}$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$. In general, if $H$ does not have
the special form $H_1 \otimes \mathbf{1}_2 + \mathbf{1}_1 \otimes H_2$, the corresponding state at any non-zero time is entangled.
The notorious Schrödinger Cat is a consequence of this phenomenon of entanglement. For
an illustrative discussion of this see, for instance, Isham (1995, Sect. 8.4.2).



Consider a product quantum system with density matrix $\rho$. On its own, the first component
has reduced density matrix $\rho_1$ obtained by "tracing out" the second component,
$(\rho_1)_{ij} = \sum_k \rho_{ik,jk}$. This procedure corresponds to computing a marginal from a joint probability
distribution. Any mixed state can be considered as the reduction to the system of
interest of a pure state on an enlarged, joint system. For instance, the completely mixed state
$\mathbf{1}/d$ is the result of tracing out the second component from the pure state with state vector $\sum_j |j\rangle \otimes |j\rangle/\sqrt{d}$ on
the product space formed from two copies of the original space.
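
The tracing-out operation is a one-line index sum. The following sketch stores the pair $(i, k)$ at flat index `i*d2 + k` (our convention, not the paper's) and checks the two facts just stated: a product state reduces to its first factor, and the state vector $\sum_j |j\rangle \otimes |j\rangle/\sqrt{d}$ reduces to $\mathbf{1}/d$. Real vectors are used to keep the code short.

```python
import math

def outer(vec):
    """Projector |v><v| for a real vector v."""
    return [[a * b for b in vec] for a in vec]

def kron_vec(a, b):
    """State vector of the product a (x) b, with (i, k) stored at index i*len(b) + k."""
    return [x * y for x in a for y in b]

def partial_trace_second(rho, d1, d2):
    """Reduced state of the first component: (rho_1)_{ij} = sum_k rho_{ik,jk}."""
    return [[sum(rho[i * d2 + k][j * d2 + k] for k in range(d2))
             for j in range(d1)] for i in range(d1)]

def maximally_entangled(d):
    """State vector sum_j |j> (x) |j> / sqrt(d) on C^d (x) C^d."""
    vec = [0.0] * (d * d)
    for j in range(d):
        vec[j * d + j] = 1 / math.sqrt(d)
    return vec

# Product state: the reduction returns the first factor |a><a|.
a = [0.6, 0.8]
b = [1 / math.sqrt(2), 1 / math.sqrt(2)]
reduced_product = partial_trace_second(outer(kron_vec(a, b)), 2, 2)

# Entangled pure state: the reduction is the completely mixed state 1/d.
d = 3
reduced_entangled = partial_trace_second(outer(maximally_entangled(d)), d, d)
```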

2.6 Further Theory of Measurements

Example 4 (Spin-half, cont.). For any unit vector $\psi$ of $\mathbb{C}^2$, let $\psi^{\perp}$ denote the unit vector
orthogonal to it (unique up to a complex phase). The observable (self-adjoint matrix)
$2|\psi\rangle\langle\psi| - \mathbf{1} = |\psi\rangle\langle\psi| - |\psi^{\perp}\rangle\langle\psi^{\perp}|$ defines a simple instrument. It has eigenvalues 1 and $-1$
and one-dimensional eigenspaces spanned by $\psi$ and $\psi^{\perp}$. This observable corresponds to the
spin of the particle in the direction (on the Poincaré sphere) defined by $\psi$. When 'the spin is
measured in this direction', meaning, when this observable is measured, the result (in appropriate
units) is either $+1$ or $-1$. Moreover, after the measurement has been carried out, the
particle is in the pure state of spin in the corresponding direction. We mentioned two such
measurements in Section 2.3 on mixing and superposition.

In particular, with outcome space $\mathcal{X} = \{-1, 1\}$, the specification

$$n_0(+1) = m(+1) = \tfrac{1}{2}(\mathbf{1} + \sigma_x), \qquad n_0(-1) = m(-1) = \tfrac{1}{2}(\mathbf{1} - \sigma_x)$$

defines a simple instrument (where the index takes only one value). It corresponds to the
observable $\sigma_x$: spin in the x-direction. □

We next discuss the notion of quantum randomisation, whereby adding an auxiliary quan- 
tum system to a system under study gives one further possibilities for probing the system 
of interest. This also connects to the important notion of realisation, i.e. representing a 
measurement by a simple measurement on a quantum randomised extension. 

Suppose we are given a Hilbert space $\mathcal{H}$, and a pair $(\mathcal{K}, \rho_a)$, where $\mathcal{K}$ is a Hilbert space and $\rho_a$ is
a state on $\mathcal{K}$. Any measurement $\widetilde{M}$ on the product space $\mathcal{H} \otimes \mathcal{K}$ induces a measurement $M$
on $\mathcal{H}$ by the defining relation

$$\mathrm{trace}(\rho\,m(x)) = \mathrm{trace}((\rho \otimes \rho_a)\,\widetilde{m}(x)) \quad \text{for all states } \rho \text{ on } \mathcal{H}, \text{ all outcomes } x. \tag{9}$$

The pair $(\mathcal{K}, \rho_a)$ is called an ancilla. The following theorem (Holevo's extension of Naimark's
Theorem) states that any measurement $M$ on $\mathcal{H}$ has the form (9) for some ancilla $(\mathcal{K}, \rho_a)$ and
some simple measurement $\widetilde{M}$ on $\mathcal{H} \otimes \mathcal{K}$. The triple $(\mathcal{K}, \rho_a, \widetilde{M})$ is called a realisation of $M$
(the words extension or dilation are also used sometimes). Adding an ancilla before taking a
simple measurement could be thought of as quantum randomisation.

Theorem 1 (Holevo, 1982). For every measurement $M$ on $\mathcal{H}$, there is an ancilla $(\mathcal{K}, \rho_a)$
and a simple measurement $\widetilde{M}$ on $\mathcal{H} \otimes \mathcal{K}$ which form a realisation of $M$.

We use the term 'quantum randomisation' because of its analogy with the mathemati- 
cal representation of randomisation in classical statistics, whereby one replaces the original 
probability space with a product space, one of whose components is the original space of 
interest, while the other corresponds to an independent random experiment with probabil- 
ities under the control of the experimenter. Just as randomisation in classical statistics is 
sometimes needed to solve optimisation problems of statistical decision theory, so quantum 
randomisation sometimes allows for strictly better solutions than can be obtained without it. 

Here is a simple spin-half example of a non-simple measurement which cannot be repre- 
sented without quantum randomisation. 

Example 5 (The triad). The triad, or Mercedes-Benz logo, has an outcome space consisting
of just three outcomes: let us call them 1, 2 and 3. Let $\vec{v}_i$, $i = 1, 2, 3$, denote three unit vectors
in the same plane through the origin in $\mathbb{R}^3$, at angles of 120° to one another. Then the matrices
$m(i) = \frac{1}{3}(\mathbf{1} + \vec{v}_i\cdot\vec{\sigma})$ define a (non-simple) measurement $M$ on the sample space $\{1, 2, 3\}$. It
turns up as the optimal solution to the decision problem: suppose a spin-half system is
generated in one of the three states $\rho_i = \frac{1}{2}(\mathbf{1} + \vec{v}_i\cdot\vec{\sigma})$, $i = 1, 2, 3$, with equal probabilities.
What decision rule gives the maximum probability of guessing the actual state correctly?
There is no way to equal the success probability of $M$ if one restricts attention to simple
measurements, or to classically randomised procedures based on simple measurements. □
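
A small numerical experiment illustrates the triad. The sketch below places the three directions in the x-z plane, so all matrices are real, and takes the three candidate states to be aligned with the triad directions, $\rho_i = \frac{1}{2}(\mathbf{1} + \vec{v}_i\cdot\vec{\sigma})$ (an assumption of this sketch). It compares the triad's success probability with a grid search over two-outcome projective measurements along directions in the same plane, each paired with its best deterministic decision rule. The grid search is an illustration only, not a proof of optimality, and all function names are ours.

```python
import math

SX = [[0.0, 1.0], [1.0, 0.0]]
SZ = [[1.0, 0.0], [0.0, -1.0]]

def spin(angle):
    """v . sigma for the unit vector v = (sin(angle), 0, cos(angle))."""
    return [[math.sin(angle) * SX[i][j] + math.cos(angle) * SZ[i][j]
             for j in range(2)] for i in range(2)]

def affine(c, angle, sign=1.0):
    """c * (1 + sign * v . sigma)."""
    s = spin(angle)
    return [[c * ((1.0 if i == j else 0.0) + sign * s[i][j]) for j in range(2)]
            for i in range(2)]

def tr_prod(A, B):
    """trace(A B)."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

angles = [2 * math.pi * i / 3 for i in range(3)]
m = [affine(1 / 3, a) for a in angles]       # triad elements m(i)
rho = [affine(1 / 2, a) for a in angles]     # assumed candidate states rho_i

# Completeness: the three elements sum to the identity.
total = [[sum(mi[i][j] for mi in m) for j in range(2)] for i in range(2)]

# Guess state i on outcome i; the three states are equally likely.
triad_success = sum(tr_prod(rho[i], m[i]) for i in range(3)) / 3

def best_simple_success(steps=3600):
    """Best success probability found among projective measurements in the plane."""
    best = 0.0
    for step in range(steps):
        w = 2 * math.pi * step / steps
        up, down = affine(0.5, w), affine(0.5, w, -1.0)
        for a in range(3):          # guess a on outcome 'up'
            for b in range(3):      # guess b on outcome 'down'
                p = (tr_prod(rho[a], up) + tr_prod(rho[b], down)) / 3
                best = max(best, p)
    return best

simple_success = best_simple_success()
```

Under these assumptions `triad_success` equals 2/3, while the search over simple measurements stops at $(1 + \sqrt{3}/2)/3 \approx 0.622$. Classical randomisation over simple measurements averages such values and so cannot exceed their maximum.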

Finally, we introduce some further terminology concerning measurements. Given a measurement
$M$ and a function $T$ from its outcome space $\mathcal{X}$ to another space $\mathcal{Y}$, one can define
a new measurement $M' = M \circ T^{-1}$ with outcome space $\mathcal{Y}$. It corresponds to restricting attention
to the function $T$ of the outcome of the first measurement $M$. We call it a coarsening
of the original measurement, and we say that $M$ is a refinement of $M'$.

So far we have restricted attention to measurements with discrete outcome space. More
generally, one considers measurements with outcomes in an arbitrary measure space $(\mathcal{X}, \mathcal{A})$
where $\mathcal{A}$ is a sigma-algebra of measurable subsets of $\mathcal{X}$. Such measurements are defined
by a collection of matrices $M(A)$ which are nonnegative, sigma-additive over $\mathcal{A}$, and such
that $M(\mathcal{X}) = \mathbf{1}$. The probability that the outcome lies in the set $A \in \mathcal{A}$ is $\mathrm{trace}(\rho M(A))$. A
measurement $M$ is called dominated by a (real, sigma-finite) measure $\nu$ on the outcome space,
if there exists a non-negative self-adjoint matrix-valued function $m(x)$, called the density of
$M$, such that $M(A) = \int_A m(x)\,\nu(dx)$ for all $A \in \mathcal{A}$. When $\mathcal{H}$ is finite dimensional, as in this
paper, every measurement is dominated: take $\nu(A) = \mathrm{trace}(M(A))$. In the dominated case,
the outcome of the measurement has a probability distribution with density $p(x; \rho, M) =
\mathrm{trace}(\rho\,m(x))$ with respect to $\nu$. If the outcome space is discrete and $\nu$ is counting measure,
then these notations are linked to our original setup by $m(x) = M(\{x\})$, $M(A) = \sum_{x \in A} m(x)$.

To exemplify these notions, suppose for some dominated measurement $M$ one can write
$m(x) = m_1(x) + m_2(x)$ for two non-negative self-adjoint matrix-valued functions $m_1$ and
$m_2$. Define $M'$ to be the measurement on the outcome space $\mathcal{X}' = \mathcal{X} \times \{1, 2\}$ with density
$m'(x, i) = m_i(x)$, $(x, i) \in \mathcal{X}'$, with respect to the product of $\nu$ with counting measure. Then
$M'$ is a refinement of $M$.

We described earlier how one can form product spaces from separate quantum systems,
leading to notions of product states, separable states, and entangled states. Given a measurement
$M$ on one component of a product space, one can naturally talk about 'the same
measurement' on the product system. It has components $M(A) \otimes \mathbf{1}$. Given measurements
$M$ and $M'$ defined on the two components of a product system, one can define in a natural
way the measurement 'apply $M$ and $M'$ simultaneously to the first and second component,
respectively': its outcome space is the product of the two outcome spaces, and it is defined
using obvious notation by $(M \otimes M')(A \times A') = M(A) \otimes M'(A')$.

A measurement M on a product space is called separable if it has a density m such that 
each m{x) can be written as a positive linear combination of tensor products of non-negative 
components. It can then be thought of as a coarsening of a measurement with density m' 
such that each m'(y) is a product of non-negative components. 

2.7 Further Theory of Instruments 

Just as we want to allow measurements also to take on continuous values, so we need instruments
to do the same.

Consider an instrument $\mathcal{N}$ with outcomes $x$ in the measurable space $(\mathcal{X}, \mathcal{A})$. Let $\pi(dx; \rho, \mathcal{N})$
denote the probability distribution of the outcome of the measurement, and let $\sigma(x; \rho, \mathcal{N})$ denote
the posterior state when the prior state is $\rho$ and the outcome of the measurement is $x$.
It follows from the laws of quantum mechanics that the only physically feasible instruments
have a special form, generalising in a natural way the definitions we gave for the discrete case.
Namely, corresponding to $\mathcal{N}$ there must exist a $\sigma$-finite measure $\nu$ on $\mathcal{X}$ (which without loss
of generality, can be taken to be a probability measure) and a collection of matrix-valued
measurable functions $n_i$ of $x$ indexed by a finite or countable index $i$, such that

$$\sum_i \int_{\mathcal{X}} n_i(x)^* n_i(x)\,\nu(dx) = \mathbf{1};$$

the posterior states for $\mathcal{N}$ are given by

$$\sigma(x; \rho, \mathcal{N}) = \frac{\sum_i n_i(x)\,\rho\,n_i(x)^*}{\sum_i \mathrm{trace}(\rho\,n_i(x)^* n_i(x))}; \tag{10}$$

and the distribution of the outcome is

$$\pi(dx; \rho, \mathcal{N}) = \sum_i \mathrm{trace}(\rho\,n_i(x)^* n_i(x))\,\nu(dx). \tag{11}$$

These formulae naturally generalise those given earlier for the discrete case. In the physics
literature, this kind of representation is often called the Kraus representation of a completely
positive instrument. Space does not suffice to explain these terms, in particular 'complete
positivity', further. The interested reader is referred to Davies and Lewis (1970), Davies (1976),
Kraus (1983), Ozawa (1985), Nielsen and Chuang (2000), Loubenets (2001), and Holevo (2001a).

When the posterior state is disregarded, the instrument $\mathcal{N}$ gives rise to the measurement
$M$ with density $m(x) = \sum_i n_i(x)^* n_i(x)$ with respect to the dominating measure $\nu$. Clearly, a
measurement $M$ can be represented as the 'data part' of an instrument in very many different
ways.
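
For a finite outcome space with $\nu$ taken to be counting measure, formulae (10) and (11) reduce to finite sums that are easy to evaluate. The sketch below is a minimal illustration under that assumption; the dictionary layout (outcome mapped to a list of matrices $n_i(x)$) and the names `apply_instrument` and `measurement_density` are our own. It is applied to the simple instrument for $\sigma_x$ from Example 4.

```python
ID = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def apply_instrument(rho, kraus):
    """Return {x: (probability, posterior state)} following (10) and (11).

    kraus maps each outcome x to the list [n_1(x), n_2(x), ...]."""
    out = {}
    for x, ops in kraus.items():
        unnorm = [[0, 0], [0, 0]]
        for n in ops:
            unnorm = add(unnorm, matmul(matmul(n, rho), dagger(n)))
        # trace(n rho n^*) = trace(rho n^* n), so this is the density in (11).
        p = (unnorm[0][0] + unnorm[1][1]).real
        posterior = [[e / p for e in row] for row in unnorm] if p > 0 else None
        out[x] = (p, posterior)
    return out

def measurement_density(kraus):
    """m(x) = sum_i n_i(x)^* n_i(x): the 'data part' of the instrument."""
    dens = {}
    for x, ops in kraus.items():
        m = [[0, 0], [0, 0]]
        for n in ops:
            m = add(m, matmul(dagger(n), n))
        dens[x] = m
    return dens

# Simple instrument for sigma_x: n(+1) = (1 + sigma_x)/2, n(-1) = (1 - sigma_x)/2.
kraus_x = {
    +1: [[[(ID[i][j] + SX[i][j]) / 2 for j in range(2)] for i in range(2)]],
    -1: [[[(ID[i][j] - SX[i][j]) / 2 for j in range(2)] for i in range(2)]],
}
rho = [[1, 0], [0, 0]]                 # prior state: spin up in the z-direction
result = apply_instrument(rho, kraus_x)
densities = measurement_density(kraus_x)
```

Here each outcome has probability 1/2 and the posterior state is the corresponding eigenprojector of $\sigma_x$, while the densities $m(\pm 1)$ sum to the identity.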

Further results of Ozawa (1985) generalise the realisability of measurements (Naimark,
Holevo theorems) to the realisability of an arbitrary completely positive instrument. Namely,
after forming a compound system by taking the tensor product with some ancilla, the instrument
can be realised as a unitary (Schrödinger) evolution for some length of time, followed
by the action of a simple instrument (measurement of an observable, with state transition
according to the Lüders-von Neumann projection postulate). Therefore to say that the most
general operation on a quantum system is a completely positive instrument comes down to
saying: the only mechanisms known in quantum mechanics are Schrödinger evolution, von
Neumann measurement, and forming compound systems. Combining these ingredients in
arbitrary ways, one remains within the class of completely positive instruments; moreover,
anything in that class can be realised in this way.

An instrument defined on one component of a product system can be extended in a natural
way (similar to that described in Section 2.6 for measurements) to an instrument on the
product system. Conversely, it is of great interest whether instruments on a product system
can in some way be reduced to 'separate instruments on the separate sub-systems'. There
are two important notions in this context. The first (similar to the concept of separability
of measurements) is the mathematical concept of separability of an instrument defined on
a product system: this is that each $n_i(x)$ in the Kraus representation of an instrument
is a tensor product of separate matrices for each component. The second is the physical
property which we shall call multilocality: an instrument is called multilocal, if it can be
represented as a coarsening of a composition of separate instruments applied sequentially to
separate components of the product system, where the choice of each instrument at each
stage may depend on the outcomes of the instruments applied previously. Moreover, each
component of the system may be measured several times (i.e. at different stages), and the
choice of component measured at the $n$th stage may depend on the outcomes at previous
stages. One should think of the different components of the quantum system as being localised
at different locations in space. At each location separately, anything quantum is allowed, but
all communication between locations is classical. It is a theorem of Bennett et al. (1999) that
every multilocal instrument is separable, but that (surprisingly) not all separable instruments
are multilocal. It is an open problem to find a physically meaningful characterisation of
separability, and conversely to find a mathematically convenient characterisation of multilocality.
(Note our terminology is not standard: the word 'unentangled' is used by some
authors instead of 'separable', and 'separable' instead of 'multilocal').

Not all joint measurements (by which we just mean instruments on product systems), are 
separable, let alone multilocal. Just as quantum randomised measurements can give strictly 
more powerful ways to probe the state of a quantum system than (combinations of) simple 
measurements and classical randomisation, so non-separable measurements can do strictly 
better than separable measurements at extracting information from product systems, even if 
a priori there is no interaction of any kind between the subsystems. 

3 Parametric Quantum Models and Likelihood 

A measurement from a quantum statistical model $(\rho, m)$ results in an observation $x$ with
density

$$p(x; \theta) = \mathrm{trace}(\rho(\theta)\,m(x))$$

and log likelihood

$$l(\theta) = \log \mathrm{trace}(\rho(\theta)\,m(x)).$$

For simplicity, let us suppose that $\theta$ is one-dimensional. It is often useful to express the
log likelihood derivative in terms of the symmetric logarithmic derivative or quantum score
of $\rho$, denoted by $\rho_{//\theta}$. This is defined implicitly as the self-adjoint solution of the equation

$$\rho_{/\theta} = \rho \circ \rho_{//\theta}, \tag{12}$$

where $\circ$ denotes the Jordan product, i.e.

$$\rho \circ \rho_{//\theta} = \tfrac{1}{2}(\rho\,\rho_{//\theta} + \rho_{//\theta}\,\rho),$$

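$\rho_{/\theta}$ denoting the ordinary derivative of $\rho$ with respect to $\theta$ (term-by-term differentiation in
matrix representations of $\rho$). (We shall often suppress the argument $\theta$ in quantities like $\rho$, $\rho_{/\theta}$,
$\rho_{//\theta}$, etc.) The quantum score exists and is essentially unique subject only to mild conditions.
(For a discussion of this see, for example, p. 274 of Holevo, 1982.)

The likelihood score $l_{/\theta}(\theta) = (d/d\theta)\,l(\theta)$ may be expressed in terms of the quantum score
$\rho_{//\theta}(\theta)$ of $\rho(\theta)$ as

$$\begin{aligned}
l_{/\theta}(\theta) &= p(x;\theta)^{-1}\,\mathrm{trace}(\rho_{/\theta}(\theta)\,m(x)) \\
&= p(x;\theta)^{-1}\,\tfrac{1}{2}\,\mathrm{trace}\big((\rho(\theta)\rho_{//\theta}(\theta) + \rho_{//\theta}(\theta)\rho(\theta))\,m(x)\big) \\
&= p(x;\theta)^{-1}\,\Re\,\mathrm{trace}(\rho(\theta)\rho_{//\theta}(\theta)\,m(x)),
\end{aligned}$$

where we have used the fact that for any self-adjoint matrices $P$, $Q$, $R$ and any matrix $T$ the
trace operation satisfies $\mathrm{trace}(PQR) = \overline{\mathrm{trace}(RQP)}$ and $\Re\,\mathrm{trace}(T) = \frac{1}{2}\mathrm{trace}(T + T^*)$. It
follows that

$$E_\theta(l_{/\theta}(\theta)) = \mathrm{trace}(\rho(\theta)\rho_{//\theta}(\theta)).$$

Thus, since the mean value of $l_{/\theta}$ is 0, we find that

$$\mathrm{trace}(\rho(\theta)\rho_{//\theta}(\theta)) = 0. \tag{13}$$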
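The expected (Fisher) information $i(\theta) = i(\theta; M) = E_\theta(l_{/\theta}(\theta)^2)$ may be written as

$$i(\theta; M) = \int p(x;\theta)^{-1}\,\big(\Re\,\mathrm{trace}(\rho(\theta)\rho_{//\theta}(\theta)\,m(x))\big)^2\,\nu(dx). \tag{14}$$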



It plays a key role in the quantum context, just as in classical statistics, and is discussed
in Section 6. In particular, we shall discuss there its relation with the expected or Fisher
quantum information

$$I(\theta) = \mathrm{trace}(\rho(\theta)\,\rho_{//\theta}(\theta)^2). \tag{15}$$

The quantum score is a self-adjoint operator, and therefore may be interpreted as an observ- 
able which one might measure on the quantum system. What we have just seen is that the 
outcome of a measurement of the quantum score has mean zero and variance equal to the 
quantum Fisher information. 
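
To make the quantum score concrete, the following sketch works through a hypothetical one-parameter spin-half model, $\rho(\theta) = \frac{1}{2}(\mathbf{1} + r(\cos\theta\,\sigma_x + \sin\theta\,\sigma_y))$ with fixed $0 < r \le 1$, which is our choice for illustration. For this model $\rho_{//\theta} = r(-\sin\theta\,\sigma_x + \cos\theta\,\sigma_y)$ solves (12), as the code verifies numerically. It then evaluates (13), (15), and the Fisher information (14) of the measurement of $\sigma_x$ (with counting measure as $\nu$).

```python
import math

ID = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def lin(c0, cx, cy):
    """c0 * 1 + cx * sigma_x + cy * sigma_y."""
    return [[c0 * ID[i][j] + cx * SX[i][j] + cy * SY[i][j] for j in range(2)]
            for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def state(theta, r):
    return lin(0.5, 0.5 * r * math.cos(theta), 0.5 * r * math.sin(theta))

def state_derivative(theta, r):
    return lin(0.0, -0.5 * r * math.sin(theta), 0.5 * r * math.cos(theta))

def quantum_score(theta, r):
    """Candidate solution of (12) for this model, checked numerically below."""
    return lin(0.0, -r * math.sin(theta), r * math.cos(theta))

def jordan(A, B):
    """Jordan product (AB + BA)/2."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[(AB[i][j] + BA[i][j]) / 2 for j in range(2)] for i in range(2)]

def quantum_information(theta, r):
    """I(theta) = trace(rho score^2), equation (15)."""
    lam = quantum_score(theta, r)
    return trace(matmul(state(theta, r), matmul(lam, lam))).real

def fisher_information(theta, r, densities):
    """i(theta; M) from (14) for a finite list of densities m(x)."""
    rho_lam = matmul(state(theta, r), quantum_score(theta, r))
    total = 0.0
    for m in densities:
        p = trace(matmul(state(theta, r), m)).real
        total += trace(matmul(rho_lam, m)).real ** 2 / p
    return total

theta, r = 0.9, 0.8
lhs, rhs = state_derivative(theta, r), jordan(state(theta, r), quantum_score(theta, r))
score_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
mean_score = trace(matmul(state(theta, r), quantum_score(theta, r)))
spin_x = [lin(0.5, 0.5, 0.0), lin(0.5, -0.5, 0.0)]   # m(+1), m(-1) for sigma_x
i_classical = fisher_information(theta, r, spin_x)
i_quantum = quantum_information(theta, r)
```

For this model the quantum information is $r^2$ for every $\theta$, while the measurement of $\sigma_x$ yields $r^2\sin^2\theta/(1 - r^2\cos^2\theta)$, which never exceeds $r^2$.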

4 Quantum Exponential and Quantum Transformation Models

In traditional statistics, the two major classes of parametric models are the exponential models
(in which the log densities are affine functions of appropriate parameters) and the transformation
(or group) models (in which a group acts in a consistent fashion on both the sample
space and the parameter space); see Barndorff-Nielsen and Cox (1994). The intersection of
these classes is the class of exponential transformation models, and its members have a particularly
nice structure. There are quantum analogues of these classes, and they have useful
properties. Below we outline some of these briefly. Considerably more discussion is given in
our works Barndorff-Nielsen et al.

4.1 Quantum Exponential Models 

A quantum exponential model is a quantum statistical model for which the states $\rho(\theta)$ can be
represented in the form

$$\rho(\theta) = e^{-\kappa(\theta)}\,e^{\frac{1}{2}\bar{\gamma}^r(\theta) T_r^*}\,\rho_0\,e^{\frac{1}{2}\gamma^r(\theta) T_r}, \qquad \theta \in \Theta,$$

where $\gamma = (\gamma^1, \ldots, \gamma^k) : \Theta \to \mathbb{C}^k$, $T_1, \ldots, T_k$ are $d \times d$ matrices, $\rho_0$ is self-adjoint and
non-negative (but not necessarily a density matrix), the Einstein summation convention (of
summing over any index which appears as both a subscript and a superscript) has been used,
and $\kappa(\theta)$ is a log norming constant, given by

$$\kappa(\theta) = \log \mathrm{trace}\big(e^{\frac{1}{2}\bar{\gamma}^r(\theta) T_r^*}\,\rho_0\,e^{\frac{1}{2}\gamma^r(\theta) T_r}\big).$$

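Three important special types of quantum exponential model are those in which $T_1, \ldots, T_k$
are self-adjoint (and for the first type, $T_0, T_1, \ldots, T_k$ all commute), and the quantum states
have the forms

$$\rho(\theta) = e^{-\kappa(\theta)} \exp(T_0 + \theta^r T_r), \tag{16}$$
$$\rho(\theta) = e^{-\kappa(\theta)} \exp(\tfrac{1}{2}\theta^r T_r)\,\rho_0\,\exp(\tfrac{1}{2}\theta^r T_r), \tag{17}$$
$$\rho(\theta) = \exp(-\tfrac{1}{2} i\theta^r T_r)\,\rho_0\,\exp(\tfrac{1}{2} i\theta^r T_r), \tag{18}$$

respectively, where $\theta = (\theta^1, \ldots, \theta^k) \in \mathbb{R}^k$ and $\rho_0$ is nonnegative (and self-adjoint), and the
summation convention is in force.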

We call these three types the quantum exponential models of mechanical type, symmetric
type, and unitary type respectively. The mechanical type arises (at least, with $k = 1$) in
quantum statistical mechanics as a state of statistical equilibrium, see Gardiner and Zoller
(2000, Sect. 2.4.2). The symmetric type has theoretical statistical significance, as we shall
see, connected among other things to the fact that the quantum score for this model is easy
to compute explicitly. The unitary type has physical significance connected to the fact that
it is also a transformation model (quantum transformation models are defined in the next
subsection). The mechanical type is a special case of the symmetric type when $T_0, T_1, \ldots, T_k$
all commute.

In general, the statistical model obtained by applying a measurement to a quantum exponential
model is not an exponential model (in the classical sense). However, for a quantum
exponential model of the form (17) in which

$$T_j = t_j(X), \quad j = 1, \ldots, k, \quad \text{for some self-adjoint } X, \tag{19}$$

i.e. the $T_j$ commute, the statistical model obtained by applying the measurement $X$ is a full
exponential model. Various pleasant properties of such quantum exponential models then
follow from standard properties of the full exponential models.
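
As a small worked instance of (17) and (19), consider the hypothetical choice $k = 1$, $T_1 = X = \sigma_z$ and $\rho_0 = \frac{1}{2}(\mathbf{1} + \sigma_x)$, which is ours and not taken from the paper. Since $\exp(\frac{1}{2}\theta\sigma_z)$ is diagonal, the state can be written down entry by entry, and measuring $\sigma_z$ gives outcome $x \in \{+1, -1\}$ with $\log p(x; \theta) = \theta x - \kappa(\theta) - \log 2$, a full exponential model with canonical statistic $x$. The sketch below checks this numerically.

```python
import math

def symmetric_model(theta):
    """rho(theta) = exp(-kappa) exp(theta sigma_z / 2) rho_0 exp(theta sigma_z / 2).

    Returns (rho, kappa) for rho_0 = (1 + sigma_x)/2."""
    half = [math.exp(theta / 2), math.exp(-theta / 2)]   # diagonal of exp(theta sigma_z / 2)
    rho0 = [[0.5, 0.5], [0.5, 0.5]]
    unnorm = [[half[i] * rho0[i][j] * half[j] for j in range(2)] for i in range(2)]
    kappa = math.log(unnorm[0][0] + unnorm[1][1])        # log norming constant
    rho = [[e * math.exp(-kappa) for e in row] for row in unnorm]
    return rho, kappa

def outcome_density(theta):
    """Distribution of the outcome when sigma_z is measured: diagonal of rho."""
    rho, _ = symmetric_model(theta)
    return {+1: rho[0][0], -1: rho[1][1]}

def log_density_gap(theta):
    """Largest deviation of log p(x; theta) from theta * x - kappa - log 2."""
    _, kappa = symmetric_model(theta)
    p = outcome_density(theta)
    return max(abs(math.log(p[x]) - (theta * x - kappa - math.log(2))) for x in (+1, -1))

gaps = [log_density_gap(theta) for theta in (-1.5, -0.2, 0.0, 0.7, 2.0)]
```

In this example $\kappa(\theta) = \log\cosh\theta$, and the state itself is not diagonal even though the outcome distribution has exponential-family form.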

The classical Cramér-Rao bound for the variance of an unbiased estimator $t$ of $\theta$ is

$$\mathrm{Var}(t) \ge i(\theta; M)^{-1}. \tag{20}$$

Combining (20) with Braunstein and Caves' (1994) quantum information bound $i(\theta; M) \le
I(\theta)$ (see (27) in Section 6.2) yields Helstrom's (1976) quantum Cramér-Rao bound

$$\mathrm{Var}(t) \ge I(\theta)^{-1}, \tag{21}$$

whenever $t$ is an unbiased estimator based on a quantum measurement. It is a classical result
that, under certain regularity conditions, the following are equivalent: (i) equality holds in
(20), (ii) the score is an affine function of $t$, (iii) the model is exponential with $t$ as canonical
statistic (cf. pp. 254-255 of Cox and Hinkley, 1974). This result has a quantum analogue, see
the corresponding theorems and corollary below, which states that under certain regularity conditions,
there is equivalence between (i) equality holds in (21) for some unbiased estimator $t$ based on
some measurement $M$, (ii) the symmetric quantum score is an affine function of commuting
$T_1, \ldots, T_k$, and (iii) the quantum model is a quantum exponential model of type (17) where
$T_1, \ldots, T_k$ satisfy (19). The regularity conditions which we assume below are indubitably too
strong: the result should be true under minimal smoothness assumptions.



4.2 Quantum Transformation Models 

Consider a parametric quantum model $(\rho, M)$ consisting of a parametrised family $\rho = (\rho(\theta) :
\theta \in \Theta)$ of states and a measurement $M$ with outcome space $(\mathcal{X}, \mathcal{A})$. Suppose that there exists
a group, $G$, with elements $g$, acting both on $\mathcal{X}$ and on $\Theta$ in such a way that the following
consistency condition ('equivariance') holds

$$\mathrm{trace}(\rho(\theta)M(A)) = \mathrm{trace}(\rho(g\theta)M(gA)) \tag{22}$$

for all $\theta$, $A$ and $g$. If, moreover, $G$ acts transitively on $\Theta$ then we say that $(\rho, M)$ is a
quantum transformation model. In this case, the resulting statistical model for the outcome
of a measurement of $M$, i.e. $(\mathcal{X}, \mathcal{A}, \mathcal{P})$, where $\mathcal{P} = (\mathrm{trace}(\rho(\theta)M) : \theta \in \Theta)$, is a classical
transformation model. Consequently, the Main Theorem for transformation models, see
Barndorff-Nielsen and Cox (1994, pp. 56-57) and references given there, applies to $(\mathcal{X}, \mathcal{A}, \mathcal{P})$.

Of special physical interest is the case in which the group acts on the states as a group of
unitary matrices.

Example 6 (Spin-half: great circle model). Consider the spin-half model
$\rho(\theta) = U\,\frac{1}{2}(\mathbf{1} + \cos\theta\,\sigma_x + \sin\theta\,\sigma_y)\,U^*$
where $U$ is a fixed $2 \times 2$ unitary matrix, and $\sigma_x$ and $\sigma_y$ are two of the
Pauli spin matrices, while the parameter $\theta$ varies through $[0, 2\pi)$; see Subsection 2.2. The
matrix $U$ can always be written (up to an irrelevant phase factor) as $\exp(-i\phi\,\vec{u}\cdot\vec{\sigma})$ for some real three-dimensional unit vector
$\vec{u}$ and angle $\phi$. Considered as a curve on the Poincaré sphere, the model forms a great
circle. If $U$ is the identity (or, equivalently, $\phi = 0$) the curve just follows the equator; the
presence of $U$ corresponds to rotating the sphere carrying this curve about the direction $\vec{u}$
through an angle $\phi$. Thus our model describes an arbitrary great circle on the Poincaré
sphere, parameterised in a natural way. Since we can write $\rho(\theta) = U V_\theta U^*\,\rho(0)\,U V_\theta^* U^*$, where
the unitary matrix $V_\theta$ corresponds to rotation of the Poincaré sphere by an angle $\theta$ about
the z-axis, we can write this model as a transformation model. The model is clearly also
an exponential model of unitary type. Perhaps surprisingly, it can be reparameterised so
as also to be an exponential model of symmetric type. We leave the details (which depend
on the algebraic properties of the Pauli spin matrices) to the reader, but just point out
that a one-parameter pure-state exponential model of symmetric type has to be of the form
$\rho(\theta) = \exp(-\kappa(\theta)) \exp(\frac{1}{2}\theta\,\vec{u}\cdot\vec{\sigma})\,\frac{1}{2}(\mathbf{1} + \vec{v}\cdot\vec{\sigma})\,\exp(\frac{1}{2}\theta\,\vec{u}\cdot\vec{\sigma})$ for some real unit vectors $\vec{u}$ and $\vec{v}$,
since every self-adjoint $2 \times 2$ matrix is an affine function of a spin matrix $\vec{u}\cdot\vec{\sigma}$. Now write
out the exponential of a matrix as its power series, and use the fact that the square of any
spin matrix is the identity. This example is due to Fujiwara and Nagaoka.
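
The transformation-model structure of the great circle model can be verified directly. In the sketch below we take $V_\theta = \exp(-\frac{1}{2} i\theta\sigma_z) = \mathrm{diag}(e^{-i\theta/2}, e^{i\theta/2})$ as the rotation about the z-axis, and as an arbitrary illustrative choice $U = \exp(-i\phi\,\sigma_y)$ with $\phi = 0.5$; both are assumptions made for the example, and the function names are ours.

```python
import cmath
import math

ID = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def great_circle(theta, U):
    """rho(theta) = U (1 + cos(theta) sigma_x + sin(theta) sigma_y)/2 U^*."""
    inner = [[(ID[i][j] + math.cos(theta) * SX[i][j] + math.sin(theta) * SY[i][j]) / 2
              for j in range(2)] for i in range(2)]
    return matmul(matmul(U, inner), dagger(U))

def rotation_z(theta):
    """V_theta = exp(-i theta sigma_z / 2)."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

def transported(theta, U):
    """(U V_theta U^*) rho(0) (U V_theta U^*)^*."""
    W = matmul(matmul(U, rotation_z(theta)), dagger(U))
    return matmul(matmul(W, great_circle(0.0, U)), dagger(W))

# U = exp(-i phi sigma_y) = cos(phi) 1 - i sin(phi) sigma_y, a real rotation matrix.
phi = 0.5
U = [[math.cos(phi), -math.sin(phi)], [math.sin(phi), math.cos(phi)]]

gaps = []
for theta in (0.0, 0.4, 1.3, 3.0, 5.5):
    a, b = great_circle(theta, U), transported(theta, U)
    gaps.append(max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2)))
```

All entries of `gaps` vanish up to rounding, so the unitaries $U V_\theta U^*$ carry $\rho(0)$ to $\rho(\theta)$ as claimed.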



5 Quantum Exhaustivity, Sufficiency, and Quantum Cuts

This section proposes and interrelates a number of concepts that will constitute, we think,
essential elements in the development of statistical inference for the quantum context. The
concepts are partly in the nature of quantum analogues of key ideas of classical statistical
inference, such as sufficiency, the likelihood principle, etc.



5.1 Quantum Exhaustivity 

Those quantum instruments for which no information on the unknown parameter of a quantum
parametric model of states can be obtained from subsequent measurements on the given
physical system deserve special attention. To simplify notation, we will write $\sigma(x; \theta, \mathcal{N})$
instead of $\sigma(x; \rho(\theta), \mathcal{N})$ for the posterior state. We propose the following definition of exhaustivity:

Definition 1 (Exhaustive instrument). A quantum instrument $\mathcal{N}$ is exhaustive for a
parameterised set $\rho = (\rho(\theta) : \theta \in \Theta)$ of states if for all $\theta$ in $\Theta$ and for $\pi(\cdot; \rho(\theta), \mathcal{N})$-almost all $x$, the
posterior state $\sigma(x; \theta, \mathcal{N})$ does not depend on $\theta$. □

Thus the posterior states obtained from exhaustive quantum instruments are completely
determined by the result of the measurement and do not depend on $\theta$.
A useful strong form of exhaustivity is defined as follows.

Definition 2 (Completely exhaustive instrument). A quantum instrument $\mathcal{N}$ is completely
exhaustive if it is exhaustive for all parameterised sets of states. □

Recall that any completely positive instrument, in other words any physically realisable
instrument, has posterior states given by (10) and outcome distributed as (11). The following
Proposition (which is a slight generalisation of a result of Wiseman, 1999) shows one way of
constructing completely exhaustive completely positive quantum instruments.

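Proposition 1. Let the quantum instrument $\mathcal{N}$ be as above, with $n_i(x)$ of the form

$$n_i(x) = |\phi_{i,x}\rangle\langle\psi_x|, \tag{23}$$

for some functions $(i, x) \mapsto \phi_{i,x}$ and $x \mapsto \psi_x$. Then $\mathcal{N}$ is completely exhaustive.

Proof. By inspection we find that the posterior state is

$$\sigma(x; \rho, \mathcal{N}) = \frac{\sum_i |\phi_{i,x}\rangle\langle\phi_{i,x}|}{\sum_i \langle\phi_{i,x}|\phi_{i,x}\rangle},$$

which does not depend on the prior state $\rho$. □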
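
Proposition 1 can be seen at work in a two-dimensional example. The sketch below builds an instrument of the form (23) with two outcomes, taking $\psi_0, \psi_1$ to be the standard basis and choosing the vectors $\phi_{i,x}$ arbitrarily (subject to the normalisation condition); it then compares the posterior states (10) obtained from two different prior states. All vectors, prior states and function names here are hypothetical choices made for illustration.

```python
import math

def ketbra(phi, psi):
    """|phi><psi| as a 2x2 matrix."""
    return [[complex(phi[i]) * complex(psi[j]).conjugate() for j in range(2)]
            for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def posterior(rho, ops):
    """Outcome probability and posterior state (10) for one outcome x."""
    unnorm = [[0, 0], [0, 0]]
    for n in ops:
        term = matmul(matmul(n, rho), dagger(n))
        unnorm = [[unnorm[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    p = (unnorm[0][0] + unnorm[1][1]).real
    return p, [[e / p for e in row] for row in unnorm]

s = math.sqrt(0.5)
kraus = {
    0: [ketbra([s, 0], [1, 0]), ketbra([0, s], [1, 0])],   # two terms, i = 1, 2
    1: [ketbra([s, 1j * s], [0, 1])],                      # a single term
}
prior_a = [[0.9, 0.1], [0.1, 0.1]]
prior_b = [[0.3, -0.2j], [0.2j, 0.7]]

results_a = {x: posterior(prior_a, ops) for x, ops in kraus.items()}
results_b = {x: posterior(prior_b, ops) for x, ops in kraus.items()}
posterior_gap = max(abs(results_a[x][1][i][j] - results_b[x][1][i][j])
                    for x in kraus for i in range(2) for j in range(2))
```

The outcome probabilities differ between the two priors (0.9 and 0.1 against 0.3 and 0.7), but `posterior_gap` is zero up to rounding: the posterior state depends only on the outcome.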



5.2 Quantum Sufficiency 

Suppose that the measurement $M' = M \circ T^{-1}$ is a coarsening of the measurement $M$. In this
situation we say that $M'$ is (classically) sufficient for $M$ with respect to a family of states
$\rho = (\rho(\theta) : \theta \in \Theta)$ on $\mathcal{H}$ if the mapping $T$ is sufficient for the identity mapping on $(\mathcal{X}, \mathcal{A})$
with respect to the family $(P(\cdot; \theta; M) : \theta \in \Theta)$ of probability measures on $(\mathcal{X}, \mathcal{A})$ induced by
$M$ and $\rho$.

As a further step towards a definition of quantum sufficiency, we introduce a concept of
inferential equivalence of parametric models of states.

Definition 3 (Inferential equivalence). Two parametric families of states $\rho = (\rho(\theta) : \theta \in
\Theta)$ and $\sigma = (\sigma(\theta) : \theta \in \Theta)$ on Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$ are said to be inferentially equivalent if
for every measurement $M$ on $\mathcal{H}$ there exists a measurement $M'$ on $\mathcal{K}$ such that for all $\theta \in \Theta$

$$\mathrm{trace}(M(\cdot)\rho(\theta)) = \mathrm{trace}(M'(\cdot)\sigma(\theta)) \tag{24}$$

and vice versa. (Note that, implicitly, the outcome spaces of $M$ and $M'$ are assumed to be
identical.) □

In other words, $\rho$ and $\sigma$ are equivalent if and only if they give rise to the same class of
possible classical models for inference on the unknown parameter.

Remark 1. It is of interest to find characterisations of inferential equivalence. This is a
nontrivial problem, even when the Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$ are the same. □

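Next, let $\mathcal{N}$ denote a quantum instrument on a Hilbert space $\mathcal{H}$ and with outcome space
$(\mathcal{X}, \mathcal{A})$ and let $\mathcal{N}' = \mathcal{N} \circ T^{-1}$ be a coarsening of $\mathcal{N}$ with outcome space $(\mathcal{Y}, \mathcal{B})$, generated by
a mapping $T$ from $(\mathcal{X}, \mathcal{A})$ to $(\mathcal{Y}, \mathcal{B})$. It is easy to show that the posterior states for the two
instruments are related by

$$\sigma(t; \theta, \mathcal{N}') = \int_{T^{-1}(t)} \sigma(x; \theta, \mathcal{N})\,\pi(dx \mid t; \theta, \mathcal{N}),$$

where $\pi(dx \mid t; \theta, \mathcal{N})$ is the conditional distribution of $x$ given $T(x) = t$ computed from
$\pi(dx; \theta, \mathcal{N})$.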

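Definition 4 (Quantum sufficiency of instruments). Let $\mathcal{N}'$ be a coarsening of an
instrument $\mathcal{N}$ by $T : (\mathcal{X}, \mathcal{A}) \to (\mathcal{Y}, \mathcal{B})$. Then $\mathcal{N}'$ is said to be quantum sufficient with respect
to a family of states $(\rho(\theta) : \theta \in \Theta)$ if

(i) the measurement $M'$ determined by $\mathrm{trace}(M'(\cdot)\rho) = \pi(\cdot; \rho, \mathcal{N}')$ is sufficient for the
measurement $M$ determined by $\mathrm{trace}(M(\cdot)\rho) = \pi(\cdot; \rho, \mathcal{N})$, with respect to the family
$(\rho(\theta) : \theta \in \Theta)$,

(ii) for any $x \in \mathcal{X}$, the posterior families $(\sigma(x; \theta, \mathcal{N}) : \theta \in \Theta)$ and $(\sigma(T(x); \theta, \mathcal{N}') : \theta \in \Theta)$
are inferentially equivalent.

□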

5.3 Quantum Cuts and Likelihood Equivalence 

In the theory of classical statistical inference, many important concepts (such as sufficiency,
ancillarity and cuts) can be expressed in terms of the decomposition by a measurable function
$T : (\mathcal{X}, \mathcal{A}) \to (\mathcal{Y}, \mathcal{B})$ of each probability distribution on $(\mathcal{X}, \mathcal{A})$ into the corresponding
marginal distribution of $T(x)$ and the family of conditional distributions of $x$ given $T(x)$. We
now define analogous concepts in quantum statistics based on the decomposition

$$\rho \mapsto (\pi(\cdot; \rho, \mathcal{N}), \sigma(\cdot; \rho, \mathcal{N})) \tag{25}$$

by a quantum instrument $\mathcal{N}$ of each state $\rho$ into a probability distribution on $(\mathcal{X}, \mathcal{A})$ and a
family of posterior states; see Section 2.7.

The classical concept of a cut encompasses those of sufficiency and ancillarity and is
therefore more basic. A measurable function $T$ is a cut for a set $\mathcal{P}$ of probability distributions
on $\mathcal{X}$ if for all $p_1$ and $p_2$ in $\mathcal{P}$, the distribution on $\mathcal{X}$ obtained by combining the marginal
distribution of $T(x)$ given by $p_1$ with the family of conditional distributions of $x$ given $T(x)$
given by $p_2$ is also in $\mathcal{P}$; see, e.g. Barndorff-Nielsen and Cox (1994, p. 38). Recent results on
cuts for exponential models can be found in Barndorff-Nielsen and Koudou (1995), which also
gives references to the useful role which cuts have played in graphical models. A generalisation
to local cuts has become important in econometrics (Christensen and Kiefer, 1994, 2000).
Replacing the decomposition into marginal and conditional distributions in the definition of
a cut by the decomposition (25) yields the following quantum analogue.

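Definition 5 (Quantum cut). A quantum instrument $\mathcal{N}$ is said to be a quantum cut for a
family $\boldsymbol{\rho}$ of states if for all $\rho_1$ and $\rho_2$ in $\boldsymbol{\rho}$

$$\pi(\cdot; \rho_3, \mathcal{N}) = \pi(\cdot; \rho_1, \mathcal{N}), \qquad
\sigma(\cdot; \rho_3, \mathcal{N}) = \sigma(\cdot; \rho_2, \mathcal{N})$$

for some $\rho_3$ in $\boldsymbol{\rho}$. □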

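Thus, if $\mathcal{N}$ is a quantum cut for a family $\boldsymbol{\rho} = \{\rho(\theta) : \theta \in \Theta\}$ with $\rho$ a one-to-one function,
then $\Theta$ has the product form $\Theta = \Psi \times \Phi$ and furthermore $\sigma(\cdot; \rho(\theta), \mathcal{N})$ depends on $\theta$ only
through $\psi$, and $\pi(\cdot; \rho(\theta), \mathcal{N})$ depends on $\theta$ only through $\phi$.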

Since a quantum instrument $\mathcal{N}$ is exhaustive for a parameterised set $\boldsymbol{\rho} = \{\rho(\theta) : \theta \in \Theta\}$ of
states if the family $\sigma(\cdot; \rho(\theta), \mathcal{N})$ of posterior states does not depend on $\theta$, exhaustive quantum
instruments are quantum cuts of a special kind. They can be regarded as quantum analogues
of sufficient statistics. At the other extreme are the quantum instruments for which the
distributions $\pi(\cdot; \rho(\theta), \mathcal{N})$ do not depend on $\theta$. These can be regarded as quantum analogues
of ancillary statistics.

Unlike exhaustivity, the concept of quantum sufficiency involves not only a quantum 
instrument but also a coarsening. The definition of quantum sufficiency can be extended to 
the following version involving parameters of interest. 

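Definition 6 (Quantum sufficiency for interest parameters). Let $\boldsymbol{\rho} = \{\rho(\theta) : \theta \in \Theta\}$
be a family of states and let $\psi : \Theta \to \Psi$ map $\Theta$ to the space $\Psi$ of interest parameters. A
coarsening $\mathcal{N}'$ of a quantum instrument $\mathcal{N}$ by a mapping $T$ is said to be quantum sufficient
for $\psi$ on $\boldsymbol{\rho}$ if

(i) the measurement $M'$ determined by $\operatorname{trace}(M'(\cdot)\rho) = \pi(\cdot; \rho, \mathcal{N}')$ is sufficient for the
measurement $M$ determined by $\operatorname{trace}(M(\cdot)\rho) = \pi(\cdot; \rho, \mathcal{N})$, with respect to the family $\boldsymbol{\rho}$,

(ii) for all $\theta_1$ and $\theta_2$ with $\psi(\theta_1) = \psi(\theta_2)$ and for all $x$ in $\mathcal{X}$, the sets $\sigma(x; \rho(\theta_1), \mathcal{N})$ and
$\sigma(T(x); \rho(\theta_2), \mathcal{N}')$ of posterior states are inferentially equivalent.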

□ 

Consideration of the likelihood function obtained by applying a measurement to a pa- 
rameterised set of states suggests that the following weakening of the concept of inferential 
equivalence may be useful.

Definition 7 (Strong likelihood equivalence). Two parametric families of states $\boldsymbol{\rho} =
\{\rho(\theta) : \theta \in \Theta\}$ and $\boldsymbol{\sigma} = \{\sigma(\theta) : \theta \in \Theta\}$ on Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$ respectively are said to be
strongly likelihood equivalent if for every measurement $M$ on $\mathcal{H}$ there is a measurement $M'$
on $\mathcal{K}$ with the same outcome space, such that

$$\frac{\operatorname{trace}(M(dx)\rho(\theta))}{\operatorname{trace}(M(dx)\rho(\theta'))}
 = \frac{\operatorname{trace}(M'(dx)\sigma(\theta))}{\operatorname{trace}(M'(dx)\sigma(\theta'))}$$

(whenever these ratios are defined) and vice versa. □

Thus the likelihood function of the statistical model obtained by applying $M$ to $\boldsymbol{\rho}$ is
equivalent to that obtained by applying $M'$ to $\boldsymbol{\sigma}$, for the same outcome of each instrument.
Consideration of the distribution of the likelihood ratio leads to the following definition.

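Definition 8 (Weak likelihood equivalence). Two parametric families of states $\boldsymbol{\rho} =
\{\rho(\theta) : \theta \in \Theta\}$ and $\boldsymbol{\sigma} = \{\sigma(\theta) : \theta \in \Theta\}$ on Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$ respectively are said to be
weakly likelihood equivalent if for every measurement $M$ on $\mathcal{H}$ with outcome space $\mathcal{X}$ there is
a measurement $M'$ on $\mathcal{K}$ with some outcome space $\mathcal{Y}$ such that the likelihood ratios

$$\frac{\operatorname{trace}(M(dx)\rho(\theta))}{\operatorname{trace}(M(dx)\rho(\theta'))}
 \quad\text{and}\quad
 \frac{\operatorname{trace}(M'(dy)\sigma(\theta))}{\operatorname{trace}(M'(dy)\sigma(\theta'))}$$

have the same distribution for all $\theta, \theta'$ in $\Theta$, and vice versa. □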

The precise connection between likelihood equivalence and inferential equivalence is not 
yet known but the following conjecture appears reasonable. 

Conjecture. Two quantum models are strongly likelihood equivalent if and only if they are 
inferentially equivalent up to quantum randomisation. 

6 Quantum and Classical Fisher Information 

In Section 3 we showed how to express the Fisher information in the outcome of a measurement
in terms of the quantum score. In this section we discuss quantum analogues of Fisher 
information and their relation to the classical concepts. 

6.1 Definition and First Properties 

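Differentiating the identity $\operatorname{trace}(\rho(\theta)\rho_{//\theta}(\theta)) = 0$ with respect to $\theta$, writing $\rho_{//\theta/\theta}$ for the
derivative of the symmetric logarithmic derivative $\rho_{//\theta}$ of $\rho$, and using the defining equation (12)
for $\rho_{//\theta}$, we obtain

$$\begin{aligned}
0 &= \operatorname{trace}\bigl(\rho_{/\theta}(\theta)\rho_{//\theta}(\theta) + \rho(\theta)\rho_{//\theta/\theta}(\theta)\bigr) \\
  &= \operatorname{trace}\bigl(\tfrac12(\rho(\theta)\rho_{//\theta}(\theta) + \rho_{//\theta}(\theta)\rho(\theta))\rho_{//\theta}(\theta)\bigr)
     + \operatorname{trace}\bigl(\rho(\theta)\rho_{//\theta/\theta}(\theta)\bigr) \\
  &= I(\theta) - \operatorname{trace}(\rho(\theta)J(\theta)),
\end{aligned}$$

where

$$I(\theta) = \operatorname{trace}\bigl(\rho(\theta)\rho_{//\theta}(\theta)^2\bigr)$$

is the expected (or Fisher) quantum information, already mentioned in earlier sections, and

$$J(\theta) = -\rho_{//\theta/\theta}(\theta),$$

which we shall call the observable quantum information. Thus

$$I(\theta) = \operatorname{trace}(\rho(\theta)J(\theta)),$$

which is a quantum analogue of the classical relation $i(\theta) = E_\theta(j(\theta))$ between expected and
observed information (where $j(\theta) = -l_{/\theta/\theta}(\theta)$). Note that $J(\theta)$ is an observable, just as $j(\theta)$
is a random variable.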

Neither $I(\theta)$ nor $J(\theta)$ depends on the choice of measurement, whereas $i(\theta) = i(\theta; M)$ does
depend on the measurement $M$. Expected quantum information behaves additively, i.e. for
parametric quantum models of states of the form $\rho : \theta \mapsto \rho_1(\theta) \otimes \cdots \otimes \rho_n(\theta)$ (which model
'independent particles'), the associated expected quantum information satisfies

$$I(\theta) = \sum_{i=1}^{n} I_i(\theta),$$

where $I_i(\theta)$ is the expected quantum information of $\rho_i$,
which is analogous to the additivity property of Fisher information.
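
The additivity can be checked directly for $n = 2$. The following is a sketch only, assuming finite-dimensional spaces and using nothing beyond the defining equation $\rho_{/\theta} = \tfrac12(\rho\rho_{//\theta} + \rho_{//\theta}\rho)$ of the quantum score. The shorthand $\lambda_i$ for the quantum score of $\rho_i$ and $I_i(\theta) = \operatorname{trace}(\rho_i\lambda_i^2)$ is introduced here purely for illustration.

```latex
\begin{align*}
\rho &= \rho_1 \otimes \rho_2, \qquad
\rho_{/\theta} = \rho_{1/\theta} \otimes \rho_2 + \rho_1 \otimes \rho_{2/\theta}, \\
\lambda &:= \lambda_1 \otimes \mathbf{1} + \mathbf{1} \otimes \lambda_2
\;\Longrightarrow\;
\tfrac12(\rho\lambda + \lambda\rho)
  = \tfrac12(\rho_1\lambda_1 + \lambda_1\rho_1) \otimes \rho_2
  + \rho_1 \otimes \tfrac12(\rho_2\lambda_2 + \lambda_2\rho_2)
  = \rho_{/\theta}, \\
I(\theta) &= \operatorname{trace}(\rho\lambda^2)
  = \operatorname{trace}(\rho_1\lambda_1^2) + \operatorname{trace}(\rho_2\lambda_2^2)
  + 2\operatorname{trace}(\rho_1\lambda_1)\operatorname{trace}(\rho_2\lambda_2)
  = I_1(\theta) + I_2(\theta).
\end{align*}
```

The second line shows that $\lambda$ satisfies the defining equation for the product state, so it is a quantum score for $\rho$. The cross term in the last line vanishes because $\operatorname{trace}(\rho_i\lambda_i) = \operatorname{trace}(\rho_{i/\theta}) = 0$.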

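In the case of a multivariate parameter $\theta$, the expected quantum information matrix $I(\theta)$
is defined in terms of the quantum scores by

$$I_{jk}(\theta) = \tfrac12 \operatorname{trace}\bigl(\rho_{//\theta_j}(\theta)\rho(\theta)\rho_{//\theta_k}(\theta) + \rho_{//\theta_k}(\theta)\rho(\theta)\rho_{//\theta_j}(\theta)\bigr). \tag{26}$$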



6.2 Relation to Classical Expected Information 

Suppose that $\theta$ is one-dimensional. There is an important relationship between expected
quantum information $I(\theta)$ and classical expected information $i(\theta; M)$, due to Braunstein and
Caves (1994), namely that for any measurement $M$ with density $m$ with respect to a $\sigma$-finite
measure $\nu$ on $\mathcal{X}$,

$$i(\theta; M) \le I(\theta), \tag{27}$$

with equality if and only if, for $\nu$-almost all $x$,

$$m(x)^{1/2}\rho_{//\theta}(\theta)\rho(\theta)^{1/2} = r(x)\,m(x)^{1/2}\rho(\theta)^{1/2}, \tag{28}$$

for some real number $r(x)$.

For each $\theta$, there are measurements which attain the bound in the quantum information
inequality (27). For instance, we can choose $M$ such that each $m(x)$ is a projection onto an
eigenspace of the quantum score $\rho_{//\theta}(\theta)$. Note that this attaining measurement may depend
on $\theta$.
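
That such a projective measurement attains the bound can be verified in a few lines. This is a sketch under the assumption of a finite-dimensional space; the spectral decomposition $\rho_{//\theta}(\theta) = \sum_x \lambda_x \Pi_x$, with eigenvalues $\lambda_x$ and eigenprojectors $\Pi_x$, is notation introduced here for illustration, and the measurement is the one with $m(x) = \Pi_x$ with respect to counting measure, held fixed while differentiating.

```latex
\begin{align*}
p(x;\theta) &= \operatorname{trace}(\rho(\theta)\Pi_x), \\
p_{/\theta}(x;\theta) &= \operatorname{trace}(\rho_{/\theta}(\theta)\Pi_x)
  = \tfrac12\operatorname{trace}\bigl((\rho\rho_{//\theta} + \rho_{//\theta}\rho)\Pi_x\bigr)
  = \lambda_x\, p(x;\theta), \\
i(\theta;M) &= \sum_x \frac{p_{/\theta}(x;\theta)^2}{p(x;\theta)}
  = \sum_x \lambda_x^2\, p(x;\theta)
  = \operatorname{trace}\bigl(\rho(\theta)\rho_{//\theta}(\theta)^2\bigr) = I(\theta).
\end{align*}
```

The middle line uses $\rho_{//\theta}\Pi_x = \Pi_x\rho_{//\theta} = \lambda_x\Pi_x$ together with cyclicity of the trace; outcomes with $p(x;\theta) = 0$ contribute nothing to either side. In other words, the classical score of this measurement at the given $\theta$ is the observed eigenvalue $\lambda_x$.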

Example 7 (Information for spin-half). Consider a spin-half particle in the pure state
$\rho = \rho(\eta, \theta) = |\psi(\eta, \theta)\rangle\langle\psi(\eta, \theta)|$ given by

$$|\psi(\eta, \theta)\rangle = \begin{pmatrix} e^{-i\theta/2}\cos(\eta/2) \\ e^{i\theta/2}\sin(\eta/2) \end{pmatrix}.$$

As we saw in Subsection 2.2, $\rho$ can be written as $\rho = (\mathbf{1} + u_x\sigma_x + u_y\sigma_y + u_z\sigma_z)/2 = \tfrac12(\mathbf{1} + u \cdot \sigma)$,
where $\sigma = (\sigma_x, \sigma_y, \sigma_z)$ are the three Pauli spin matrices and $u = (u_x, u_y, u_z) = u(\eta, \theta)$ is the
point on the Poincaré sphere $S^2$ with polar coordinates $(\eta, \theta)$. Suppose that the colatitude $\eta$
is known and exclude the degenerate cases $\eta = 0$ or $\eta = \pi$; the longitude $\theta$ is the unknown
parameter.

Since all the $\rho(\theta)$ are pure, one can show that $\rho_{//\theta}(\theta) = 2\rho_{/\theta}(\theta) = u_{/\theta}(\theta) \cdot \sigma =
\sin(\eta)\, u(\pi/2, \theta + \pi/2) \cdot \sigma$. Using the properties of the Pauli matrices, one finds that the
quantum information is

$$I(\theta) = \operatorname{trace}\bigl(\rho(\theta)\rho_{//\theta}(\theta)^2\bigr) = \sin^2\eta.$$
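
The value $\sin^2\eta$ can be checked numerically. The sketch below is illustrative only: it takes the pure-state formula $\rho_{//\theta} = 2\rho_{/\theta}$ from the text, approximates the derivative by a central difference, and uses hypothetical helper names (`rho`, `quantum_score`, `quantum_information`) that do not come from the paper.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]

def rho(eta, theta):
    """Density matrix |psi><psi| of the pure state with polar coordinates (eta, theta)."""
    psi = [cmath.exp(-0.5j * theta) * math.cos(eta / 2),
           cmath.exp(0.5j * theta) * math.sin(eta / 2)]
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

def quantum_score(eta, theta, h=1e-6):
    """Pure-state quantum score 2 * d(rho)/d(theta), via a central difference."""
    plus, minus = rho(eta, theta + h), rho(eta, theta - h)
    return [[2 * (plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def quantum_information(eta, theta):
    """Expected quantum information I(theta) = trace(rho * score^2)."""
    score = quantum_score(eta, theta)
    return trace(matmul(rho(eta, theta), matmul(score, score))).real
```

Up to the discretisation error of the finite difference, `quantum_information(eta, theta)` should agree with `math.sin(eta) ** 2` for any longitude `theta`.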

Summarising some results from Barndorff-Nielsen and Gill (2000), we now discuss a condition
that a measurement must satisfy in order for it to achieve this information.

It follows from (28) that, for a pure spin-half state $\rho = |\psi\rangle\langle\psi|$, a necessary and sufficient
condition for a measurement to achieve the information bound is: for $\nu$-almost all $x$, $m(x)$ is
proportional to a one-dimensional projector $|\xi(x)\rangle\langle\xi(x)|$ satisfying

$$\langle\xi(x)|2\rangle\langle 2|a\rangle = r(x)\langle\xi(x)|1\rangle,$$

where $r(x)$ is real, $|1\rangle = |\psi\rangle$, $|2\rangle = |\psi\rangle^{\perp}$ ($|\psi\rangle^{\perp}$ being a unit vector in $\mathbb{C}^2$ orthogonal to
$|\psi\rangle$) and $|a\rangle = 2|\psi\rangle_{/\theta}$. It can be seen that geometrically this means that $|\xi(x)\rangle$ corresponds to a
point on $S^2$ in the plane spanned by $u(\theta)$ and $u_{/\theta}(\theta)$.

If $\eta \ne \pi/2$, then distinct values of $\theta$ give distinct planes, and all these planes intersect in
the origin only. Thus no single measurement $M$ can satisfy $I(\theta) = i(\theta; M)$ for all $\theta$. On the
other hand, if $\eta = \pi/2$, so that the states $\rho(\theta)$ lie on a great circle in the Poincaré sphere,
then the planes defined for each $\theta$ are all the same. In this case any measurement $M$ with all
components proportional to projector matrices for directions in the plane $\eta = \pi/2$ satisfies
$I(\theta) = i(\theta; M)$ for all $\theta \in \Theta$. In particular, any simple measurement in that plane has this
property.
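
The contrast between the two cases can be illustrated numerically for a simple measurement of spin along a fixed direction $v$. The sketch below assumes the standard outcome probabilities $(1 \pm u \cdot v)/2$ for such a measurement and computes the classical Fisher information by a central difference; the function names are hypothetical, and the code should not be called at values of $\theta$ where one outcome has probability zero.

```python
import math

def bloch(eta, theta):
    """Point u(eta, theta) on the Poincare sphere."""
    return (math.sin(eta) * math.cos(theta),
            math.sin(eta) * math.sin(theta),
            math.cos(eta))

def fisher_information(eta, theta, v, h=1e-6):
    """Classical Fisher information i(theta; M) for the two-outcome measurement
    of spin along the unit vector v, with outcome probabilities (1 +/- u.v)/2."""
    def probs(t):
        dot = sum(a * b for a, b in zip(bloch(eta, t), v))
        return ((1 + dot) / 2, (1 - dot) / 2)

    p = probs(theta)
    p_plus, p_minus = probs(theta + h), probs(theta - h)
    info = 0.0
    for k in range(2):
        dp = (p_plus[k] - p_minus[k]) / (2 * h)  # derivative of outcome probability
        info += dp * dp / p[k]
    return info
```

With `v` in the equatorial plane, for example `v = (math.cos(0.4), math.sin(0.4), 0.0)`, the result for `eta = math.pi / 2` should equal the quantum information $\sin^2\eta = 1$ at every admissible `theta`, whereas for `eta = 1.0` it falls strictly below $\sin^2\eta$ except at isolated values of `theta`.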

More generally, a smooth one-parameter model of a spin-half pure state with everywhere
positive quantum information admits a uniformly attaining measurement, i.e. such that
$I(\theta) = i(\theta; M)$ for all $\theta \in \Theta$, if and only if the model is a great circle on the Poincaré sphere.
This is actually a quantum exponential transformation model, as in an earlier example. □

When the state $\rho$ is strictly positive, and under further nondegeneracy conditions, essentially
the only way to achieve the bound (27) is through measuring the quantum score. In
the discussion below we first keep the value of $\theta$ fixed. Since any nonnegative self-adjoint
matrix can be written as a sum of rank-one matrices (using its eigenvalue-eigenvector
decomposition), it follows that any dominated measurement can be refined to a measurement
for which each $m(x)$ is of rank 1, thus $m(x) = r(x)|\xi(x)\rangle\langle\xi(x)|$ for some real $r(x)$ and state
vector $|\xi(x)\rangle$; see the end of Section 2.6. If one measurement is the refinement of another,
then the distributions of the outcomes are related in the same way. Therefore, under
refinement of a measurement, Fisher expected information cannot decrease. Therefore if any
measurement achieves (27), there is also a measurement with rank 1 components achieving
the bound. Consider such a measurement. Suppose that $\rho$ is positive and that all the
eigenvalues of $\rho_{//\theta}$ are different. The condition $m(x)^{1/2}\rho_{//\theta}\rho^{1/2} = r(x)m(x)^{1/2}\rho^{1/2}$ is then
equivalent to $|\xi(x)\rangle\langle\xi(x)|\rho_{//\theta} = r(x)|\xi(x)\rangle\langle\xi(x)|$, which states that $\xi(x)$ is an eigenvector of
$\rho_{//\theta}$. Since we must have $\int m(x)\nu(dx) = \mathbf{1}$, it follows that all eigenvectors of $\rho_{//\theta}$ occur in this
way in components $m(x)$ of $M$. The measurement can therefore be reduced or coarsened to
a simple measurement of the quantum score, and the reduction (at the level of the outcome)
is sufficient.

Suppose now that the state $\rho(\theta)$ is strictly positive for all $\theta$, and that the quantum score
has distinct eigenvalues for at least one value of $\theta$. Suppose that a single measurement exists
attaining (27) uniformly in $\theta$. Any refinement of this measurement therefore also achieves the
bound uniformly; in particular, the refinement to components which are all proportional to
projectors onto orthogonal one-dimensional eigenspaces of the quantum score at the value of
$\theta$ where the eigenvalues are distinct does so. Therefore the eigenvectors of the quantum score
at this value of $\theta$ are eigenvectors at all other values of $\theta$. Therefore there is a self-adjoint
operator $X$ with distinct eigenvalues such that $\rho_{//\theta}(\theta) = f(X; \theta)$ for each $\theta$. Fix $\theta_0$ and let
$F(X; \theta) = \int_{\theta_0}^{\theta} f(X; \theta')\,d\theta'$. Let $\rho_0 = \rho(\theta_0)$. If we consider the defining equation (12) as a
differential equation for $\rho(\theta)$ given the quantum score, and with initial condition $\rho(\theta_0) = \rho_0$,
we see that a solution is $\rho(\theta) = \exp(\tfrac12 F(X; \theta))\rho_0\exp(\tfrac12 F(X; \theta))$. Under smoothness conditions
the solution is unique. Rewriting the form of this solution, we come to the following theorem:

Theorem 2 (Uniform attainability of quantum information bound). Suppose that
the state is everywhere positive, the quantum score has distinct eigenvalues for some value
of $\theta$, and is smooth. Suppose that a measurement $M$ exists with $i(\theta; M) = I(\theta)$ for all $\theta$,
thus attaining the Braunstein-Caves information bound (27) uniformly in $\theta$. Then there is
an observable $X$ such that a simple measurement of $X$ also achieves the bound uniformly,
and the model is of the form

$$\rho(\theta) = c(\theta)\exp(\tfrac12 F(X; \theta))\,\rho_0\exp(\tfrac12 F(X; \theta)) \tag{29}$$

for a function $F$, indexed by $\theta$, of an observable $X$ where $c(\theta) = 1/\operatorname{trace}(\rho_0\exp(F(X; \theta)))$,
$\rho_{//\theta}(\theta) = f(X; \theta) - \operatorname{trace}(\rho(\theta)f(X; \theta))$, and $f(X; \theta) = F_{/\theta}(X; \theta)$. Conversely, for a model of
this form, a measurement of $X$ achieves the bound uniformly.
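
The stated form of the quantum score for a model of type (29) can be confirmed by direct differentiation. The following is a sketch assuming enough smoothness to differentiate under the trace; it uses only the fact that $f(X;\theta)$ and $\exp(\tfrac12 F(X;\theta))$ commute, both being functions of the same observable $X$. Arguments $(X;\theta)$ are suppressed for brevity.

```latex
\begin{align*}
\rho_{/\theta}
 &= \frac{c_{/\theta}}{c}\,\rho + \tfrac12\bigl(f\rho + \rho f\bigr), \\
\frac{c_{/\theta}}{c}
 &= -\frac{\operatorname{trace}(\rho_0\, f\, e^{F})}{\operatorname{trace}(\rho_0\, e^{F})}
  = -\,c\,\operatorname{trace}\bigl(e^{F/2}\rho_0\, e^{F/2} f\bigr)
  = -\operatorname{trace}(\rho f), \\
\rho_{/\theta}
 &= \tfrac12\Bigl(\rho\,\bigl(f - \operatorname{trace}(\rho f)\mathbf{1}\bigr)
   + \bigl(f - \operatorname{trace}(\rho f)\mathbf{1}\bigr)\,\rho\Bigr).
\end{align*}
```

Comparing the last line with the defining equation $\rho_{/\theta} = \tfrac12(\rho\rho_{//\theta} + \rho_{//\theta}\rho)$ gives $\rho_{//\theta}(\theta) = f(X;\theta) - \operatorname{trace}(\rho(\theta)f(X;\theta))$, a function of $X$ alone, which is why a simple measurement of $X$ measures the quantum score at every $\theta$ simultaneously.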

Remark 2 (Spin-half case). In the spin-half case, if the information is positive then the
quantum score has distinct eigenvalues, since the outcome of a measurement of the quantum 
score always equals one of the eigenvalues, has mean zero, and positive variance. □ 

Theorem 3 (Uniform attainability of quantum Cramér-Rao bound). Suppose that
the positivity and nondegeneracy conditions of the previous theorem are satisfied, and suppose
that for the outcome of some measurement $M$ there is a statistic $t$ such that, for all $\theta$, $t$ is
an unbiased estimator of $\theta$ achieving Helstrom's quantum Cramér-Rao bound (21), $\operatorname{Var}(t) =
I(\theta)^{-1}$. Then the model is a quantum exponential model of symmetric type

$$\rho(\theta) = c(\theta)\exp(\tfrac12\theta T)\,\rho_0\exp(\tfrac12\theta T)$$

for some observable $T$, and simple measurement of $T$ is equivalent to the coarsening of $M$ by
$t$.

Proof. The coarsening $M' = M \circ t^{-1}$ by $t$ of the measurement $M$ also achieves the quantum
information bound (27) uniformly; i.e. $i(\theta; M') = I(\theta)$. Applying Theorem 2 to this
measurement, we discover that the model is of the form (29), while (if necessary refining the
measurement to have rank one components) $t$ can be considered as a function of the outcome
of a measurement of the observable $X$, and it achieves the classical Cramér-Rao bound for
unbiased estimators of $\theta$ based on this outcome. The density of the outcome (with respect to
counting measure on the eigenvalues of $X$) is found to be $c(\theta)\exp(F(x; \theta))\operatorname{trace}(\rho_0\Pi_x)$ where
$\Pi_x$ is the projector onto the eigenspace of $X$ corresponding to eigenvalue $x$. Hence, up to
addition of functions of $\theta$ or $x$ alone, $F(x; \theta)$ is of the form $\theta t(x)$. □

The basic inequality (27) holds also when the dimension of $\theta$ is greater than one. In
that case, the quantum information matrix $I(\theta)$ is defined in (26) and the Fisher information
matrix $i(\theta; M)$ is defined by

$$i_{rs}(\theta; M) = E_\theta\bigl(l_r(\theta)\,l_s(\theta)\bigr),$$

where $l_r$ denotes $l_{/\theta_r}$ etc. Then (27) holds in the sense that $I(\theta) - i(\theta; M)$ is positive
semi-definite. The inequality is sharp in the sense that $I(\theta)$ is the smallest matrix dominating all
$i(\theta; M)$. However it is typically not attainable, let alone uniformly attainable.

Theorem 2 can be generalised to the case of a vector parameter. This also leads to a
generalisation of Theorem 3, which is the content of Corollary 1 below.

Theorem 4. Let $\{\rho(\theta) : \theta \in \Theta\}$ be a twice differentiable parametric quantum model. If

(i) there is a measurement $M$ with $i(\theta; M) = I(\theta)$ for all $\theta$,

(ii) $\rho(\theta)$ is positive for all $\theta$,

(iii) $\Theta$ is simply connected

then, for any $\theta_0$ in $\Theta$, there are an observable $X$ and a function $F$ (possibly depending on $\theta_0$)
such that

$$\rho(\theta) = \exp(\tfrac12 F(X; \theta))\,\rho(\theta_0)\exp(\tfrac12 F(X; \theta)).$$

Corollary 1. If, under the conditions of Theorem 4, there exists an unbiased estimator $t$ of $\theta$
based on the measurement $M$ achieving (21), then the model is a quantum exponential family
of symmetric type (11) with commuting $T_r$.

Versions of these results have been known for some time; see Young (1975), Fujiwara and Nagaoka
(1995), Amari and Nagaoka (2000); compare especially our Corollary 1 to Amari and Nagaoka
(2000, Theorem 7.6), and our Theorem 4 to parts (I)-(IV) of the subsequent outlined proof in
Amari and Nagaoka (2000). Unfortunately, precise regularity conditions and detailed proofs
seem to be available elsewhere only in some earlier publications in Japanese. Note that we
have obtained the same conclusions, by a different proof, in the spin-half pure state case,
Example 7. This indicates that a more general result is possible without the hypothesis of
positivity of the state. See Matsumoto (2002) for important new work on the pure state case.



The symmetric logarithmic derivative is not the unique quantum analogue of the classical
statistical concept of score. Other analogues include the right, left and balanced derivatives
obtained from suitable variants of (12). Each of these gives a quantum information inequality
and a quantum Cramér-Rao bound analogous to (27) and (21). See Belavkin (1976), and
(as yet) unpublished new work by this author. There is no general relationship between the
various quantum information inequalities when the dimension of $\theta$ is greater than one.

Asymptotic optimality theory for quantum estimation has only just started to be developed;
see Gill and Massar (2000), Gill (2001b), and Keyl and Werner (2001); for an application
see Hannemann et al. (2002).



7 Classical versus Quantum 

This section makes some general comments on the relation between classical and quantum
probability and statistics. This has been a matter of heated controversy ever since the
discovery of quantum mechanics. It has mathematical, physical, and philosophical ingredients, and
much confusion, if not controversy, has been generated by problems of interdisciplinary
communication between mathematicians, physicists, philosophers and more recently statisticians.
Authorities from both physics and mathematics, perhaps starting with Feynman (1951), have
promoted vigorously the standpoint that 'quantum probability' is something very different
from 'classical probability'. Malley and Hornstein (1993) conclude from a perceived conflict
between classical and quantum probability that 'quantum statistics' should be set apart from
classical statistics. Even Williams (2001) states that Nature chooses a different model for
probability for the quantum world than for the classical world.

In our opinion, though important mathematical and physical facts lie at the root of these 
statements, they are misleading, since they seem to suggest that quantum probability and 
quantum statistics do not belong to the field of classical probability and statistics. However, 
quantum probabilities have the same meaning (whether you are a Bayesian or a frequentist) 
as classical probabilities, and statistical inference problems from quantum mechanics fall 
squarely in the framework of classical statistics. The statistical design problems are special 
to the field. 

Our stance is that the predictions which quantum mechanics makes of the real world are 
stochastic in nature. A quantum physical model of a particular phenomenon allows one to 
compute probabilities of all possible outcomes of all possible measurements of the quantum 
system. The word 'probability' means here 'relative frequency in many independent rep- 
etitions'. The word 'measurement' is meant in the broad sense of 'macroscopic results of 
interactions of the quantum system under study with the outside world'. These predictions 
depend on a summary of the state of the quantum system. The word 'state' might suggest 
some fundamental property of a particular collection of particles, but for our purposes all we 
need to understand under the word is 'a convenient mathematical encapsulation of the infor- 
mation needed to make any such predictions'. Some physicists argue that it is meaningless 
to talk of the state of a particular particle, one can only talk of the state of a large collection 
of particles prepared in identical circumstances; this is called a statistical ensemble. Others 
take the point of view that when one talks about the state of a particular quantum system 
one is really talking about a property of the mechanism which generated that system. Given 
that quantum mechanics predicts only probabilities, as far as real-world predictions are con- 
cerned the distinction between on the one hand a property of an ensemble of particles or of a 
procedure to prepare particles, and on the other hand a property of one particular particle, 
is a matter of semantics. However, if one would like to understand quantum mechanics by 
somehow finding a more classical (intuitive) physical theory in the background which would 
explain the observed phenomena, this becomes an important issue. It is also an issue for 
cosmology, when there is only one closed quantum system under study: the universe. At this 
level there is a remarkable difference between classical and quantum probabilities: according
to the celebrated theorem of Bell (1964) it is impossible to derive the probabilities described
in quantum mechanics by an underlying deterministic theory from which the probabilities 
arise 'merely' as the reflection of statistical variation in the initial conditions, unless one ac- 
cepts grossly unphysical nonlocality in the 'hidden variables'. Thus quantum probabilities are 
fundamentally irreducible, in contrast to every other physical manifestation of randomness 
known to us. 

It follows from our standpoint that 'quantum statistics' is classical statistical inference 
about unknown parameters in models for data arising from measurements on a quantum 
system. However, just as in biostatistics, geostatistics, etc., many of these statistical problems 
have a common structure and it pays to study the core ideas and common features in detail.
As we have seen, this leads to the introduction of mathematical objects such as quantum
score, quantum expected information, quantum exponential family, quantum transformation 
model, quantum cuts, and so on; the names are deliberately chosen because of analogy and 
connections with the existing notions from classical statistics. 

Already at the level of probability (i.e. before statistical considerations arise) one can 
see a deep and fruitful analogy between the mathematics of quantum states and observables 
on the one hand, and classical probability measures and random variables on the other. 
Note that collections of random variables and collections of operators can both be endowed 
with algebraic structure (sums, products, ...). It is a fact that from an abstract point 
of view a basic structure in probability theory (a collection of random variables $X$ on a
countably generated probability space, together with their expectations $\int X\,dP$ under a given
probability measure $P$) can be represented by a (commuting) subset of the set of self-adjoint
operators $Q$ on a separable Hilbert space, together with the expectations $\operatorname{trace}(\rho Q)$ computed
using the trace rule under a given state $\rho$. Thus a basic structure in classical probability
theory is isomorphic to a special case of a basic structure in quantum probability. 'Quantum 
probability', or 'noncommutative probability theory' is the name of the branch of mathematics 
which takes as its starting point the mathematical structure of states and observables in 
quantum mechanics. From this mathematical point of view, one may claim that classical 
probability is a special case of quantum probability. The claim does entail, however, a rather 
narrow (functional analytic) view of classical probability. Moreover, many probabilists will 
feel that abandoning commutativity is throwing away the baby with the bathwater, since 
this broader mathematical structure has no analogue of the sample outcome $\omega$, and hence no
opportunity for a probabilist's beloved probabilistic arguments. 

As statisticians, we would like to argue (tongue in cheek) that quantum probability is 
merely a special case of classical statistics. A quantum probability model is determined by 
specifying the expectations of every observable. This is equivalent to specifying a family of 
classical probability models: namely the joint probability distribution of the measurements 
of every commuting subset of observables. The basic structure of quantum probability is 
mathematically equivalent to a particular case of the basic structure of classical statistical 
inference — namely, an indexed family of probability models. 



8 Other Topics 

There are many further topics in quantum physics where more extensive use of knowledge 
and techniques from classical statistics and probability seems likely to lead to substantial 
scientific advances. However, classical concepts and results from the latter fields will often 
need considerable modification or recasting to be suitable and relevant for the quantum world, 
as is exemplified in parts of the previous Sections. 

Here we shall indicate briefly a few of the topics. The selection of these is motivated 
mainly by our own current interests rather than by an aim to be in some way representative 
of the broad picture. However, the topics listed are all subject to considerable developments 
in the current literature. For more detailed accounts, with references to the physics and 
mathematics literature, see Barndorff-Nielsen, Gill and Jupp.



8.1 Quantum Tomography 

In its simplest form, the problem of quantum tomography is as follows.

The simple harmonic oscillator is the basic model for the motion of a quantum particle
in a quadratic potential well on the real line. Precisely the same mathematical structure
describes oscillations of a single mode of an electromagnetic field (a single frequency in one
direction in space). In this type of structure one considers the quadrature observable at phase
$\phi$, given by $X_\phi = Q\cos\phi + P\sin\phi$, where $Q$ and $P$ are the position and momentum operators.
Here, the underlying Hilbert space $\mathcal{H}$ is infinite dimensional and the operators $Q$ and $P$ can
be characterised abstractly by the commutation relation $[Q, P] = i\hbar\mathbf{1}$.

Given independent measurements of $X_\phi$, with $\phi$ drawn repeatedly at random from the
uniform distribution on $(0, 2\pi]$, the aim is to reconstruct the unknown state $\rho$ of the quantum
system. In statistical terms, we wish to do nonparametric estimation of $\rho$ from $n$ independent
and identically distributed observations $(\phi_i, x_i)$, with $\phi_i$ as just described and $x_i$ from the
measurement of $X_{\phi_i}$. In quantum optics, measuring a single mode of an electromagnetic field
in what is called a quantum homodyne experiment, this would be the appropriate model with
perfect photodetectors. In practice, independent Gaussian noise should be added.
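
To make the data structure $(\phi_i, x_i)$ concrete, the following sketch simulates ideal homodyne data for a deliberately simple special case. It is a toy parametric illustration, not the nonparametric problem described above, and everything specific in it is an assumption made here: the state is taken to be a coherent state with complex amplitude `alpha`, units are chosen so that $\hbar = 1$, and the standard fact is used that $X_\phi$ is then Gaussian with mean $\sqrt{2}\,(\mathrm{Re}\,\alpha\cos\phi + \mathrm{Im}\,\alpha\sin\phi)$ and variance $1/2$. The function names are hypothetical.

```python
import math
import random

def simulate_homodyne(alpha, n, rng):
    """Draw n pairs (phi_i, x_i): phi_i uniform on (0, 2*pi], x_i Gaussian with
    mean sqrt(2) * (Re(alpha) cos(phi_i) + Im(alpha) sin(phi_i)) and variance 1/2."""
    data = []
    for _ in range(n):
        phi = 2 * math.pi * (1 - rng.random())  # rng.random() lies in [0, 1)
        mean = math.sqrt(2) * (alpha.real * math.cos(phi) + alpha.imag * math.sin(phi))
        data.append((phi, rng.gauss(mean, math.sqrt(0.5))))
    return data

def estimate_alpha(data):
    """Least-squares fit of x on (cos phi, sin phi), rescaled to an estimate of alpha."""
    scc = sum(math.cos(p) ** 2 for p, _ in data)
    sss = sum(math.sin(p) ** 2 for p, _ in data)
    scs = sum(math.cos(p) * math.sin(p) for p, _ in data)
    scx = sum(math.cos(p) * x for p, x in data)
    ssx = sum(math.sin(p) * x for p, x in data)
    det = scc * sss - scs * scs
    a = (sss * scx - scs * ssx) / det  # coefficient of cos(phi)
    b = (scc * ssx - scs * scx) / det  # coefficient of sin(phi)
    return complex(a, b) / math.sqrt(2)
```

For example, `estimate_alpha(simulate_homodyne(complex(1.0, -0.5), 20000, random.Random(0)))` should return a value close to `1 - 0.5j`. In the genuine tomography problem the state is not restricted to such a two-parameter family, which is what makes the estimation nonparametric.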

Some key references are the book Leonhardt (1997) and the survey papers D'Ariano
(1997a, b, 2001). Of special interest is a maximum likelihood based approach to the problem
that has been taken in recent work by Banaszek et al. (2000). We think that it is a major
open problem to work out the asymptotic theory of this method, taking account of data-driven
truncation, and possibly alleviating the problem of the large parameter-space by using
Bayesian methods. The method should be tuned to the estimation of various functionals of
$\rho$ of interest, and should provide standard errors or confidence intervals.



8.2 Quantum Stochastic Processes and Continuous-Time Measurements 

In this paper we have focussed directly on questions of quantum statistical inference. Of major
related importance are the areas of quantum stochastic processes and continuous-time quantum
measurements. These are currently undergoing rapid developments, and the concept of
quantum instruments, discussed above, has a key role in parts of this. References to much of
this work are available in Percival (1998), Holevo (2001), Belavkin (2002), and
Barndorff-Nielsen and Loubenets (2002); see also the more extensive account referred to above.

There are quantum analogues of Brownian motion and Poisson processes, and more generally
of Lévy processes, and a quantum stochastic analysis based on these. Interesting combinations
of classical and quantum stochastic analysis occur in a variety of contexts, for instance in
Monte Carlo simulation studies of the Markov quantum master equation; Mølmer and Castin
(1996) is an important early reference. The Markov quantum master equation is important
particularly in quantum optics, which is one of the currently most active and exciting fields
of quantum physics.

Other, mainly mathematically motivated, studies have strongly algebraic elements, such
as in free probability. In this context a variety of 'independence' concepts have turned up
with associated Lévy processes, etc. See Barndorff-Nielsen and Thorbjørnsen (2002)
and Franz et al. and references given there. Note, moreover, that Biane and Speicher
(2001) discuss a concept of free Fisher information.

8.3 Quantum Tomography of Operations



We have focussed on quantum statistical models where only the state depends on an unknown 
parameter. Of great interest is also the situation where an unknown operation acts on a known 
state. 

Consider a quantum instrument $\mathcal{N}$ which produces no data but simply converts an input
state $\rho$ into an output state $\sigma(\rho; \mathcal{N})$. By the general theory, $\sigma(\rho; \mathcal{N}) = \sum_i n_i\rho n_i^*$ for some
collection of matrices $n_i$ satisfying $\sum_i n_i^* n_i = \mathbf{1}$. This representation is not unique but one
can fix the $n_i$ by making some identifying restrictions. One could then proceed to estimate
the $n_i$ by feeding the instrument with sufficiently many different input states $\rho$, many times,
each time carrying out sufficiently many different measurements on the output state.

It has recently been discovered that there is an extremely effective short cut to this
procedure. Consider two copies of the original quantum system, supposed here to be of
dimension $d$. Consider the maximally entangled state $|\Psi\rangle = \sum_j |j\rangle \otimes |j\rangle/\sqrt{d}$ on the product
system. Now allow the instrument $\mathcal{N}$ to act on the first component of the product system,
while the second component is left unchanged. The output state, also on the product system,
has density matrix $\sigma(|\Psi\rangle\langle\Psi|; \mathcal{N} \otimes \mathcal{I})$ where $\mathcal{I}$ denotes the identity instrument. It turns out that
the output state completely characterizes $\mathcal{N}$. In fact, there is a one-to-one correspondence
between on the one hand completely positive data-less instruments $\mathcal{N}$, and on the other hand
density matrices on the product system such that the reduced density matrix of the second
component is the completely mixed state $\sum_j |j\rangle\langle j|/d$ (i.e., the same as the reduced density
matrix of the second component initially, which is left unchanged by the procedure).

Thus one does not need to probe the instrument with many different input states, but
can effectively probe it with all inputs simultaneously, by exploiting quantum entanglement
with an auxiliary system. The problem has been converted into a quantum statistical model
$\sigma(|\Psi\rangle\langle\Psi|; \mathcal{N} \otimes \mathcal{I})$ with parameter being the unknown instrument $\mathcal{N}$.
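
The correspondence can be illustrated with a small computation for $d = 2$. The sketch below is not taken from the paper: the bit-flip instrument, with matrices $n_0 = \sqrt{1-p}\,\mathbf{1}$ and $n_1 = \sqrt{p}\,\sigma_x$, is a hypothetical example chosen for illustration, and all function names are made up here. It builds the output state $\sum_i (n_i \otimes \mathbf{1})|\Psi\rangle\langle\Psi|(n_i \otimes \mathbf{1})^*$ and the reduced density matrix of the second component.

```python
import math

D = 2  # dimension of the original system (assumed for the illustration)

def apply_to_first_half(kraus):
    """Output density matrix (D*D x D*D, basis index a*D + b) obtained by letting
    the data-less instrument with matrices `kraus` act on the first component of
    the maximally entangled state sum_j |j>|j> / sqrt(D)."""
    out = [[0j] * (D * D) for _ in range(D * D)]
    for n in kraus:
        # (n tensor 1)|Psi> has component n[a][b] / sqrt(D) at index (a, b)
        v = [n[a][b] / math.sqrt(D) for a in range(D) for b in range(D)]
        for r in range(D * D):
            for c in range(D * D):
                out[r][c] += v[r] * complex(v[c]).conjugate()
    return out

def reduced_second(out):
    """Reduced density matrix of the second component (partial trace over the first)."""
    return [[sum(out[a * D + b][a * D + b2] for a in range(D)) for b2 in range(D)]
            for b in range(D)]

def bit_flip(p):
    """Hypothetical example instrument: exchanges the two basis states with probability p."""
    s0, s1 = math.sqrt(1 - p), math.sqrt(p)
    return [[[s0, 0], [0, s0]], [[0, s1], [s1, 0]]]
```

For any `p`, `reduced_second(apply_to_first_half(bit_flip(p)))` should be the completely mixed state, while the full output state itself changes with `p`, so that the unknown instrument is encoded in a single output density matrix.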

This procedure has been pioneered by D'Ariano and Lo Presti (2001) and has already
been exploited experimentally.



8.4 Conclusion 

This paper has, in brief form, presented our present view of a role for statistical inference 
in quantum physics. We are keenly aware that many relevant parts of quantum physics and 
quantum stochastics have not been reviewed, or have only been touched upon.